FOREWORD


The  Thirty-Fifth  Conference  on  the  Design  of  Experiments  in  Army 
Research,  Development  and  Testing  had  as  its  host  the  TRADOC  Test 
and Experimentation Command, Experimentation Center (TEC), Fort
Ord, California. This conference was planned for 18-20 October
1989,  and  was  held  in  the  Monterey  Beach  Hotel,  Monterey,  CA.  The 
earthquake  on  17  October  prevented  several  of  the  speakers  from 
attending  this  meeting;  and  while  the  power  was  off,  problems  arose 
for  many  of  the  speakers.  Dr.  Marion  Bryson,  Director  of  TEC, 
served  as  local  host  and  conference  coordinator.  He  and  members  of 
his  staff  are  to  be  commended  for  supplying  innovative  and 
immediate  solutions  to  many  problems  associated  with  the  quake. 
Without  their  support  the  conference  would  never  have  succeeded. 

The  Army  Mathematics  Steering  Committee  (AMSC)  is  the  sponsor  of 
the  Conference  on  the  Design  of  Experiments.  Members  of  this 
committee  would  like  to  thank  D.  Hue  McCoy,  TRADOC  Analysis 
Command,  for  organizing  the  Special  Session  on  "Statistical  Issues 
Related  to  Combat  Modeling."  The  speakers  were  Hue  McCoy,  Bill 
Baker (BRL), and Eugene Dutoit (Infantry School). This session
achieved  its  purpose  of  stimulating  a  dialogue  between  combat 
modelers  and  the  statistical  community.  The  AMSC  members  feel  that 
the  addresses  by  the  principal  speakers,  as  well  as  the  contributed 
papers  by  Army  and  academic  personnel,  also  stimulated  the 
interchange  of  ideas  among  the  scientists  attending  this  meeting. 
Noted  below  is  the  list  of  invited  speakers  selected  by  the  Program 
Committee: 

Speaker and Affiliation            Title of Address

Professor Robert Bechhofer         An Appraisal of Several
Cornell University                 Multistage Selection
                                   Procedures

Professor William J. Conover       Latin Hypercube Sampling, a
Texas Tech University              Way of Saving Computer Runs

Professor Gary Koch                An Overview of Statistical
University of North Carolina       Methods for Categorical Data
at Chapel Hill

Professor David W. Scott           Statistical Data Analysis
Rice University

Another event associated with each of these conferences is a two-
day tutorial. This year, Ronald Hocking of Texas A&M University
presented a tutorial entitled "Analysis of Linear Models with
Unbalanced Data." It was held two days before the start of the
conference and was conducted in the TEC Protocol Building at Fort
Ord.


As  the  master  of  ceremonies  at  the  banquet  and  the  recipient  of  the 
Wilks  Award  last  year,  Dr.  Marion  Bryson  had  the  honor  of 
announcing  the  winner  of  the  ninth  U.S.  Army  Wilks  Award,  Professor 
Boyd  Harshbarger.  He  was  selected  because  of  his  research 
endeavors,  his  promotional  activities  for  Army  applications,  his 
unending  supply  of  speakers  for  these  conferences,  and  his  help  in 
numerous  ways  to  carry  the  Army  forward  in  many  important 
statistical  areas.  Because  of  ill  health,  Professor  Harshbarger 
was  unable  to  attend  the  conference.  Dr.  Douglas  Tang, 
representing  the  Army  statistical  community,  accepted  the  award  on 
his  behalf. 

Members of the Army Mathematics Steering Committee would like to
thank the members of the Program Committee for guiding this
scientific conference, and also to thank the Mathematical Sciences
Division  of  the  Army  Research  Office  for  preparing  the  proceedings 
of  these  meetings. 


PROGRAM COMMITTEE

Carl Bates         Robert Burge       Francis Dressel
Eugene Dutoit      Hue McCoy          Carl Russell
Douglas Tang       Malcolm Taylor     Jerry Thomas
Henry Tingey
TRADOC Test and Experimentation Command

Experimentation  Center  (TEC) 

WELCOMING  REMARKS 

GENERAL  SESSION  I 

Chairperson: Marion R. Bryson, TRADOC Test and Experimentation
Command,  Experimentation  Center 

KEYNOTE ADDRESS:

AN  APPRAISAL  OF  SEVERAL  MULTISTAGE  SELECTION  PROCEDURES 
Robert  Bechhofer,  Cornell  University 

BREAK 

STATISTICAL  DATA  ANALYSIS 

David  W.  Scott,  Rice  University 

LUNCH 
Nozer Singpurwalla, George Washington University

HAS VARIABILITY BEEN REDUCED?

WHICH DISTRIBUTION APPLIES?

STATISTICAL DATA ANALYSIS:

HOW  FAR  WILL  COMPUTER  GRAPHICS  TAKE  US? 

David  W.  Scott 
Department  of  Statistics 
Rice  University 
P.O.  Box  1892 
Houston,  Texas  77251-1892 

ABSTRACT.  In  this  paper  we  survey  the  directions  researchers  are  following  in  statistical 
graphics. Hardware support for animation and color is expanding rapidly while price is at least
decreasing. While a fairly optimistic scenario can be drawn, the most correct statement we can
make about the future of graphics and statistical computing is that the uncertainty has never
been greater. Potential obstacles towards effective use of computer graphics are discussed, particularly
in the academic setting. Strategies to break these bottlenecks will be suggested. Otherwise
excess  CPU  cycles  may  remain  so. 

1.  INTRODUCTION.  Each  year  at  the  annual  meeting  of  the  National  Computer  Graphics 
Association,  a  gala  dinner  is  held  at  which  the  winners  of  various  computer  graphics  contests  are 
presented.  As  the  winning  computer-generated  images  and  videos  are  presented,  with  bumble  bees 
darting among flowers and pool balls reflecting the images of a futuristic shiny room, one is
overwhelmed by the sheer raw power and impact of the presentation. There is not (yet) a category
for  statistical  presentation,  but  one  senses  this  is  not  out  of  the  question. 

The impact of modern computer graphics on statistical education and practice has not yet
been great. Eddy et al., in a recent article in Statistical Science, have attempted to describe future
computing needs and trends, and graphics is an important part of the overall picture. The average
statistician retains a small collection of typical images that are recycled over and over: scatter
diagrams  including  residual  plots,  frequency  curves  such  as  histograms,  curve  fits  such  as  regression 
lines,  elliptical  contours  of  normal  densities  including  principal  components;  the  list  is  surprisingly 
small.  Far  more  emphasis  is  given  to  tables;  summary  statistics  tables,  chi-squared  tables, 
analysis of variance tables, tables of percentiles, and spreadsheets. This follows the natural inclination
of statisticians to present a parsimonious summary of an instance of data analysis: choose
a powerful model well-studied in the literature, estimate parameters and determine significance,
and present results summarizing the model in tabular and sometimes graphical forms. Image processing,
animation, and rotation are all very unparsimonious statistical tools.

Historically,  technology  has  affected  the  relative  importance  of  these  forms.  Early  data 
analysts such as John Graunt and William Petty favored tabular presentation; after all, paper was
a  dear  commodity.  William  Playfair  showed  the  array  of  graphical  presentation  of  business  data 
was  worth  the  paper.  Computation  was  expensive,  and  the  human  effort  required  for  creating 
effective  graphs  was  relatively  cost-effective.  Karl  Pearson  began  the  trend  towards  testing  and 
tabular  presentation,  but  devoted  much  energy  to  graphs  in  the  form  of  frequency  curves.  Fisher 
and  others  accelerated  the  tabular  form  with  analysis  of  variance  and  maximum  likelihood,  which 
emphasizes  parametric  analysis  over  the  more  graphical  nonparametric  analysis.  The  emphasis 
was on mathematical statistics. The rapid increase in number crunching ability spawned the creation
of statistical packages, with largely numerical output. Graphics was not ignored in such packages
(certainly not in the past few years), but the quality was relatively low and options limited.
Quality graphics output is still much more expensive than computing, but the absolute price of
both has decreased so dramatically that we are seeing an explosion of interest in graphical statistics.
Truly impressive packages for personal computers are available, and SAS and SPSS have provided
similar capabilities for mainframes. Separately, many non-statistical companies provide
software for presentational graphics, aimed at business markets; ISCOL is one example, but such
quality products cost even academic workers many thousands of dollars.


2. CURRENT IMPACT OF COMPUTER GRAPHICS. How strong has the impact of computer
graphics been on the statistical community? To look at many journals and statistical textbooks,
you would be hard pressed to detect any revolution. In its fourth edition, Hogg and Craig's
classical textbook on mathematical statistics contains only five figures! The Journal of the American
Statistical Association is showing the change, but in unexpected ways. Roughly half of the
papers contain only tables. Those with figures contain more figures than papers ten years ago, but
ironically the quality is poorer. Ten years ago artwork was professionally drawn (if only approximating
truth). Many figures today are drawn by PCs, which are acceptable but clearly inferior in
presentation quality and impact to their professional cousins. But the cost is so much less that we
accept  substandard  quality.  The  very  recent  increase  in  laser  graphical  output  partially  justifies 
the  premature  switch  to  PC  graphics. 

The long and short of it is that we are within five years of everyone having the ability to
produce very high quality two-dimensional graphics virtually without cost. In other words, we
have  succeeded  in  automating  the  kinds  of  graphs  William  Playfair  drew  200  years  ago. 

3. NEW DIRECTIONS IN COMPUTER GRAPHICS. The emphasis of this paper is on
how much farther computer graphics will take statistics. Why is there a trend towards newer
graphical presentations? Graphics is at odds with classical statistics because graphics is non-parsimonious.
A graph cannot be neatly summarized or reduced to a few key coefficients and p-values.
Graphs demand close scrutiny and invite speculation and interpretation, something hardly
ever seen in parametric analyses. But the fundamental distinguishing feature is that graphs are
subjective, imprecise, manipulative, yet powerful. One novel multivariate graph is the Chernoff
face. An entire conference in 1978 was devoted to evaluating the subjective aspects of this technique,
in particular, coping with the almost infinite possible alternative constructions for individual
datasets.  There  is  no  consensus  whether  it  is  a  serious  statistical  tool.  The  discipline  of  statistics 
attempts  to  be  very  precise  about  its  imprecision,  and  many  statisticians  do  not  find  graphs  precise 
enough  to  serve  as  the  analysis,  preferring  tables  and  statistics. 

Yet  the  whole  new  technology  of  computer  graphics  and  enhanced  graphics  chips  has  opened 
up  the  possibility  of  a  new  generation  of  presentation  graphics.  More  statisticians  are  focusing 
their  research  effort  in  this  area,  and  are  represented  by  the  new  ASA  section  called  statistical 
graphics.  The  concerns  about  limitations  of  the  old  style  graphics  are  even  more  critical  in  the 
new style of graphics. The key additional features are color, solids rendering, translucency, and
animation; the Pixar machine is the state-of-the-art for all of these features. If we consider the
exploratory graphical tools for high dimensional data, we see that an important part of data
analysis is luck. For the higher the dimension, the smaller the fraction of data that can be
“explored” in a given amount of time. Thus different workers examining the same multivariate
data will probably see disjoint parts of it - quite in contrast to a parametric world using principal
components. Even the order in which the data are examined can be a factor, given the inevitable
fatigue. Some research is already under way to help automate the searching process (reminds me
of the computer science project to automate the game Rogue, called rogomatic). But real objections
have been made about this imprecise form of data analysis. The use of color excludes those
who are color blind. The use of stereo viewing techniques is maddeningly unsuccessful for a large
percentage of professionals. Each new subjective element increases the power of the data analysis
but decreases the reliability and widespread usefulness of these techniques. Publishing is virtually
impossible, until CD-ROM publishing is available. A nonexhaustive list of projects includes: projection
pursuit (Tukey, Friedman, Stuetzle); animated scatter plots (Tukey, Huber, Donoho);
exploratory  methods  (Tukey  and  Tukey);  density  estimation  (Scott,  Thompson,  Tarter);  glyphs 
and  stereo  (Carr  and  Nicholson);  grand  tours  (Buja  and  Asimov);  programming  languages  (Becker, 
Chambers,  Donoho,  Huber);  programming  environments  (McDonald). 
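
The remark that higher dimension shrinks the fraction of data one can explore can be made concrete with a small sketch, which is not from the talk. It counts only the axis-aligned two-variable scatterplots of d-dimensional data, d(d-1)/2 of them, and reports what fraction a fixed viewing budget covers. The budget of 20 views and the function names are assumptions for illustration; rotations and general projections would make the space of views far larger.

```python
from math import comb


def pairwise_views(d):
    """Number of axis-aligned 2-D scatterplots for d variables: d*(d-1)/2."""
    return comb(d, 2)


def fraction_explored(d, budget):
    """Fraction of those scatterplots covered by a fixed number of views."""
    total = pairwise_views(d)
    # With fewer than two variables there is nothing left unexplored.
    return min(1.0, budget / total) if total else 1.0


if __name__ == "__main__":
    # Hypothetical budget: an analyst has time to look at 20 plots.
    for d in (3, 5, 10, 20, 50):
        print(d, pairwise_views(d), round(fraction_explored(d, budget=20), 3))
```

Under these assumptions the covered fraction falls roughly as 1/d^2, so two analysts with the same budget can easily look at non-overlapping sets of views.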

4. MANAGING THE FUTURE. But enough about how hard it all will be and how unappreciated
it all may be. Are we going to be able to sustain research in novel statistical graphics?
As an engineering undergraduate in 1968, I used to wait in line to use a Wang time-sharing calculator
terminal (it actually could do the transcendental functions to twelve significant digits!). Once
we began doing our number crunching through programming languages, we could accept and track
the new computing resources with almost no overhead. So in the past fifteen years, I have written
Fortran (and PL/1) programs on as many types of hardware. The only overhead was learning a
new editor and a few system commands, and the faster and bigger machine was immediately increasing
productivity and opening new horizons. There is still a bit more of that to be had. With the
workstations  now  available,  we  have  finally  obtained  the  luxury  of  wasting  a  huge  fraction  of  CPU 
cycles.  This  is  of  course  a  correct  state  of  affairs  given  the  relative  cost  of  faculty  time.  Idle  CPU 
seconds  are  costly  only  in  terms  of  maintenance;  idle  graphics  workstations  cannot  yet  be  justified 
as  maintenance  costs  are  very  high. 

But  we  must  face  two  developments.  The  first  is  parallel  computing.  The  second  is  graphics. 
Statisticians can probably make more effective use of parallel computers than any other single group
of  researchers,  because  much  of  our  computing  involves  very  loosely  coupled  computation  such  as 
Monte  Carlo  simulation.  Numerical  analysts,  on  the  other  hand,  face  tightly  coupled  computation 
which  provides  real  gains  only  in  rather  specific  situations.  Theoretical  limits  exist  to  performance 
in  tightly  coupled  systems,  no  matter  how  many  parallel  processors  are  available.  But  all  that 
aside, to effectively use hypercube or other parallel architectures is not a straightforward exercise.
It is even worse than having to give up your favorite programming language and return to assembler.
Serious allocation of time and other supporting resources must be made at this time. One
reaction is that it is not worth the effort and just to wait until some computer scientist writes an
incredible parallel compiler that takes non-parallel code and optimizes it for parallel environments.
(Not too likely in my opinion. Gene Golub at Stanford, in a comment after a lecture by John Rice,
lamented that there weren’t enough numerical analysts to go around to try and make parallel algorithms
for each differential equation and hardware configuration.)
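
One standard way to quantify a theoretical limit of this kind is Amdahl's law. The talk does not name it, so treating it as the limit meant here is an assumption. If a fraction s of the work is inherently serial, n processors give a speedup of at most 1/(s + (1 - s)/n), which can never exceed 1/s. The serial fractions in the sketch below are illustrative values, not measurements.

```python
def amdahl_speedup(serial_fraction, n_processors):
    """Upper bound on speedup when serial_fraction of the work cannot be parallelized."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


if __name__ == "__main__":
    # Hypothetical serial fractions: a loosely coupled Monte Carlo study
    # versus a tightly coupled computation.
    for label, s in (("loosely coupled", 0.001), ("tightly coupled", 0.20)):
        for n in (16, 1024):
            print(label, n, round(amdahl_speedup(s, n), 1))
```

With a serial fraction of 0.20 the bound stays below 5 however many processors are added, whereas a nearly independent simulation keeps gaining from additional processors for much longer.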

Graphics presents the same challenge. With more modest effort, one can produce useful pictures
on a PC or graphics terminal of the William Playfair variety. Playing with the color tables
can be fun. Choosing the specific 256 colors from the 16,777,216 choices can be a bit frustrating.
Graphics chips have helped enormously, putting frequently used graphical transformations into
hardware and supporting animation. The interface with these chips is at about the same level as
other graphics commands, almost at the assembler level, pixel by pixel. Some systems are available
at the command level to avoid this, but the convenience eventually becomes the limitation,
both in functionality and performance. At a somewhat lower level, graphics standards have
appeared, such as CORE and GKS. But any commercial outfit will admit that the advantages of
portability are outweighed by the benefits of performance allowed by assembler programming. But
most academics are satisfied by “prototype” systems rather than commercial performance.
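
The figure 16,777,216 is 256 cubed, the number of colors addressable with 8 bits for each of red, green, and blue. As a rough sketch of one simple way to pick 256 of them, the code below builds an evenly spaced palette with 8 levels of red, 8 of green, and 4 of blue. This uniform layout and the function name are assumptions for illustration, not a method described in the talk.

```python
def uniform_palette(r_levels=8, g_levels=8, b_levels=4):
    """Return an evenly spaced RGB palette with r_levels*g_levels*b_levels entries.

    Each level count must be at least 2.
    """
    def steps(n):
        # n evenly spaced intensities from 0 to 255 inclusive
        return [round(i * 255 / (n - 1)) for i in range(n)]

    return [(r, g, b)
            for r in steps(r_levels)
            for g in steps(g_levels)
            for b in steps(b_levels)]


if __name__ == "__main__":
    palette = uniform_palette()
    print(len(palette), "of", 256 ** 3, "possible colors")
```

A fixed palette like this is easy to build but ignores the data; choosing colors suited to a particular display is the frustrating part the text refers to.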

My  observation  is  that  with  graphics  systems  it  is  very  difficult  to  build  upon  previous  work. 
Each new generation of hardware demands a complete new attack. As the graduate students who
did the previous system disappear, the next generation of students have a more difficult task getting
up to speed. For the better hardware often has many more capabilities, so reproducing the
previous system is often much harder. Therefore, less time is available for extending the previous
system  and  actually  less  research  gets  done.  This  is  a  bit  overdrawn,  but  accurately  reflects  what 
has happened over the past fifteen years. At Berkeley, a biostatistical researcher developed an
analysis and graphical system on some IBM hardware that he nursed for eight years beyond its
supported  lifetime,  before  finally  biting  the  bullet  and  updating  hardware.  At  Rice  and  Stanford 
and  other  places,  graduate  students  who  worked  on  very  specialized  hardware  and  produced  very 
useful systems graduated and went away. What was left was a collection of faculty who had
directed  the  research  but  who  did  not  have  the  time  to  actually  program  the  system,  maintain  it, 
or even fully understand it. Thus the next generation of graduate students basically found it
impossible to effectively use the machines. Maintenance costs and down-time were significant as
the expensive hardware aged, and using the previous student’s system was frustrating (and not
research). Taking the time to start anew and create a wholly new system was deemed too
risky, since rumors that the machine might be sold (since no one was using it) began to circulate.
The traditionally successful faculty/graduate student relationship was found wanting. The need
for continuity implied the need for a new (nontraditional) type of person in the picture, the staff
support group. These persons can usually be recruited from recent graduates by offering post-docs,
research positions, and other positions not commonly found in statistics groups. Thus there is a
need to restructure research personnel to continue this work. The systems are too complex for
individual faculty to manage (much less to retrain unproductive faculty). Fewer and fewer graduate
students are able to master the complexities of these systems in the few years available and
make real contributions. Those who can, leave quickly, leaving behind a serious void in continuity,
rendering  expensive  equipment  unusable  almost  overnight.  These  statistics  and  computer  science 
wizards  are  not  well-recognized  as  doing  valid  statistical  research  worthy  of  tenure  track  (as 
opposed to statistical computing). The result is inability to do the desired research, which necessarily
includes extensive systems development. We seem to be moving towards the system used by
the sciences: many post-docs per faculty member as well as support staff to provide full-time research
effort  and  continuity  of  systems  expertise  and  support,  something  that  cannot  be  even  partially 
satisfied  by  faculty  and  students  alone.  Unfortunately,  the  job  market  is  so  strong  in  statistics  as 
opposed  to  these  other  areas  that  it  will  be  very  difficult  to  build  up  new  centers  and  move 
towards  the  big  research  lab  model. 

This  will  be  a  rather  traumatic  trend.  It  is  well-known  that  using  programmers  greatly 
reduces  output  (due  to  decreased  reliability  of  code  and  less  intimate  knowledge  of  the  problem) 
and  decreases  hands-on  experimentation  that  leads  to  new  developments,  but  senior  faculty  time 
cannot usually be allocated significantly for this purpose. Debugging purely graphical systems is
extraordinarily  difficult.  Dr.  Banchoff  at  Brown  University  reports  that  Roger  Penrose  found  a  bug 
in  a  four-dimensional  hidden-line  removal  algorithm  by  simply  watching  it  perform.  Testing  will 
be an enormous headache and problem. Everything looks so pretty when the output is graphics.
It is difficult to be critical. We have watched computer science departments try to manage very large
development projects. Statistical researchers will have to pay attention to how those efforts have
been organized and managed. Statisticians seem to be a bit impatient and more satisfied with prototypes
of systems than is healthy for the profession.

Another  approach  has  been  to  move  to  novel  computing  environments  that  hold  the  promise 
of  improved  user  productivity  and  portability.  The  LISP  machines  fall  into  this  category. 

At Battelle Labs in Richland, Washington, Wes Nicholson and Dan Carr have pioneered
research into the use of glyphs and stereo viewing for data analysis. In 1983 they invited a distinguished
panel of statisticians and computer scientists to review and criticize their progress. It is
clear from the reprinted papers and discussion that the visitors could not decide what was “fundamental
research” and what was merely “systems development.” This lack of a clear understanding
of the joint roles of these activities has hindered the professional development of many young
computer-bound  statisticians. 

5. CONCLUSIONS. We asked the question: how far will computer graphics take us? The
answer is a long way, but not with the current research structure. Graphics requires as much support
as supercomputing or parallel architectures, but may not get it directly. Many of the sciences
and engineering departments have received adequate laboratory resources and statistics must be
added to the list. The need for and trend towards graphics cannot be altered, but we can work on
improving presentation quality and effectiveness, as Bill Cleveland (1985) and others have
been  attempting  to  evaluate.  Statisticians  have  contributed  much  to  the  burgeoning  field  of 
"scientific  visualization,"  but  it  is  computer  scientists  who  have  dominated  the  funding  in  the  field. 
A  closer  working  relationship  to  the  fields  of  application  is  already  occurring  but  more  should  be 
expected.  Finally,  examples  of  figures  shown  in  the  original  talk  may  be  found  in  the  references 
below. 
ABSTRACT

The  standard  U.S.  Army  desert  camouflage  uniform  appears  dark  against  U.S.  and  Saudi 
Arabian desert backgrounds. Prototype uniforms were developed and evaluated in the desert
Southwest in 1986. Test results led to further evaluation, in 1987, of seven new uniforms, plus
the standard uniform. Uniforms were shown in all possible pairs, at ten sites, to U.S. Marine
Corps  and  Fort  Belvoir  personnel,  who  served  as  ground  observers.  The  uniforms  were  judged 
on  their  ability  to  blend  with  the  background.  The  best  of  each  pair  was  independently 
selected.  An  analysis  of  variance  and  Duncan’s  Multiple-Range  Test  statistics  were  performed. 
It  was  determined  for  most  sites,  and  across  all  sites,  that  three  new  uniforms  were 
significantly (α ≤ 0.05) best in blending with the background.

1.0  SECTION  1  -  INTRODUCTION 

The  standard  U.S.  Army  desert  camouflage  uniform  is  made  in  a  pattern  consisting  of 
six colors. The predominant color areas are tan, khaki, light brown, and dark brown. Small
light-brown areas outlined in black are scattered throughout the other color areas. This
uniform  was  taken  to  Saudi  Arabia  in  1980,  and  viewed  against  multiple  desert  backgrounds. 
In  all  cases  the  uniform  appeared  dark  and  did  not  blend  well  with  any  of  the  observed  desert 
backgrounds.  This  information  was  given  to  counter-surveillance  personnel  at  Natick  RD&E 
Center, MA. A series of seven prototype desert uniforms was then made and given to Fort
Belvoir for a desert evaluation in 1986. Analysis of this data¹ identified uniforms 4, 5, and
6  as  being  the  most  effective  in  terms  of  blending  with  the  U.S.  desert  test  sites  investigated. 

Using  the  additional  test  information  collected  by  Belvoir  as  a  basis,  Natick  then 
developed  uniforms  8,  9,  10,  and  11  for  further  evaluation.  These  uniforms,  along  with 
uniforms 4, 5, and 6 and the standard U.S. Army uniform, identified as uniform 1, were
evaluated  in  the  U.S.  desert  Southwest  in  1987.  The  quantitative  analysis  of  their  ability  to 
blend  with  various  Southwest  desert  backgrounds  is  the  subject  of  this  report. 

2.0  SECTION  2  -  PROCEDURE 

2.1  Test  Uniforms 

A  total  of  eight  camouflage  uniforms  were  evaluated.  The  following  is  a  description 
of  each  uniform: 

♦  Uniform  #1-Standard  U.  S.  Army  Desert  Day  Camouflage  Pattern 

A six-color pattern now in use by the U.S. military consisting of the colors Light Tan
379*, Tan 380*, Light Brown 381*, Dark Brown 382*, Black 383*, and Khaki 384*.

♦  Uniform  #4 

A  three-color  pattern  of  Light  Tan  379*,  Khaki  384*,  and  Light  Brown  381*. 
♦ Uniform #5

A  three-color  pattern  of  Light  Tan  379*,  Tan  380*,  and  Khaki  384*. 

♦  Uniform  #6 

A  three-color  pattern  of  Desert  Tan  459*,  Khaki  384*,  and  Light  Brown  381*. 

♦  Uniform  #8 

A  solid-color  uniform  of  Tan  380*. 

♦  Uniform  #9 

A  solid-color  uniform  of  Khaki  384*. 

♦  Uniform  #10 

A  three-color  pattern  of  Khaki  384*,  brown**  and  sand**. 

♦ Uniform #11

A  two-color  pattern  of  clay**  and  Khaki  384*. 

*Natick  numerical  color  designations 
**No  numbers  assigned 

2.2 Test Sites

A  total  of  ten  sites  were  selected  for  the  study.  All  the  desert  sites  contained  sparse 
vegetation  similar  to  that  found  in  areas  of  interest  in  the  Middle  East.  The  soil  ranged  in 
color  from  a  light  buff/tan  to  gray  and  dark  brown,  and  represented  a  good  cross-sectional 
spectrum  of  different-colored  desert  backgrounds.  The  order  of  the  ten  sites  as  they  will 
appear  throughout  this  study  is  seen  in  Table  1. 

Table 1

Site Order Identification

Site #    Color              Location

1         Buff               Yuma Sand Dunes, AZ
2         Light Gray         Ogilby Road, Tumco, CA
3         Very Light Tan     Yuma Proving Grounds, AZ
4         Dark Beige Tan     Anza Borrego State Park, CA
5         Light Tan          Tank Trail, 29 Palms, CA
6         Dark Tan           Salton Sea, CA
7         Beige Tan          Anza Borrego State Park, CA
8         Light Beige Tan    Anza Borrego State Park, CA
9         Tan                Jean Dry Lake Bed, NV
10        Gray Tan           Rt. 15, Baker, CA


2.3  Test  Subjects 

The  test  subjects  consisted  of  U.S.  Marine  Corps  enlisted  men  from  Camp  Pendleton, 
CA,  and  civilians  from  the  U.S.  Army  Natick  Research,  Development,  and  Engineering  Center, 
Natick,  MA,  and  the  U.S.  Army  Belvoir  Research,  Development,  and  Engineering  Center,  Fort 
Belvoir, VA. Between 10 and 15 observers were used at each test
site. All subjects had at least a corrected visual acuity of 20/30 and normal color vision.


2.4  Data  Generation 

The  eight  uniforms  were  viewed,  individually,  in  all  possible  pairs  (28).  The  viewing 
distance  from  the  subject  to  each  pair  of  uniforms  was  about  25  meters.  The  observers  were 
told to select the one uniform from each pair that best matched or blended with the
surrounding  background  in  terms  of  color.  The  observers  were  instructed  to  discount 
shrubbery  if  present.  This  instruction  was  necessary,  because  of  the  very  sparse  shrubbery  in 
the  deserts  of  the  Middle  East  when  compared  with  the  U.S.  desert  Southwest.  The  mean 
preference  with  associated  standard  error,  95%  confidence  intervals,  analysis  of  variance, 
and Duncan's Multiple-Range Test² were calculated for all sites, and averaged across all ten sites.
The  higher  the  mean  preference,  the  more  preferred  the  colors  were  rated  by  the  ground 
observers  as  blending  with  the  desert  background. 
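
The count of 28 is the number of ways to choose 2 uniforms from 8. The sketch below enumerates the pairs and then computes a mean, a standard error, and a normal-approximation 95% interval for one uniform. It assumes that an observer's preference score for a uniform is the number of pairs in which that uniform was chosen; the report does not define the score, so this is only one plausible reading. The scores are made up for illustration and are not data from the study.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev

UNIFORMS = [1, 4, 5, 6, 8, 9, 10, 11]  # uniform numbers from Section 2.1


def all_pairs(items):
    """Every unordered pair of distinct items."""
    return list(combinations(items, 2))


def mean_with_interval(scores, z=1.96):
    """Mean, standard error, and approximate 95% interval (normal approximation)."""
    m = mean(scores)
    se = stdev(scores) / sqrt(len(scores))
    return m, se, (m - z * se, m + z * se)


if __name__ == "__main__":
    pairs = all_pairs(UNIFORMS)
    print(len(pairs))  # 28 pairs; each uniform appears in 7 of them
    # Hypothetical scores (wins out of 7) for one uniform from 10 observers.
    hypothetical_scores = [5, 6, 4, 7, 5, 6, 5, 4, 6, 5]
    print(mean_with_interval(hypothetical_scores))
```

With only 10 to 15 observers per site, a t-based multiplier would give a somewhat wider interval than the 1.96 used in this sketch.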

3.0  SECTION  3  -  RESULTS 

The  camouflage  uniforms  were  evaluated  at  each  of  the  ten  sites  to  determine  which 
colors  best  blended  with  the  desert  environment.  Section  2.4  describes  how  the  data  was 
generated  for  all  sites,  and  when  averaged  across  all  sites.  Table  2  shows  the  uniforms  that 
best  blended  with  each  site  and  when  averaged  across  all  sites. 

Table 2

Summary of the Best Desert Uniforms for Each Site
in Ability to Blend with the Background

Uniforms:  1   4   5   6   8   9   10   11

[The column positions of the X marks are not preserved; each row shows only
how many uniforms were marked as best for that site.]

Site 1              X  X
Site 2              X  X  X  X
Site 3              X  X  X  X  X
Site 4              X  X  X
Site 5              X  X
Site 6              X  X  X  X
Site 7              X  X  X
Site 8              X  X
Site 9              X  X  X
Site 10             X  X  X
Across All Sites    X  X  X

The statistical results of each site for the above best camouflage uniforms will not be
included, because they would be too voluminous to present in these proceedings. This data is
available upon request from the U.S. Army Belvoir Research, Development and Engineering
Center, ATTN: STRBE-JDA, Fort Belvoir, VA 22060. Table 3 contains the mean preference
with associated standard error and 95% confidence interval for the ability of the desert
uniforms to blend with the background, when averaged across all sites. Figure 1 is the graphic
display of Table 3. Table 4 is the analysis of variance performed to determine if there are
significant differences between the various camouflage uniforms in their ability to blend with
the desert backgrounds. Table 5 identifies which uniforms differ from each other through the
Duncan's Multiple-Range Test.
Table 3

Mean Preference Rating for Desert Background Blend
and 95-Percent Confidence Intervals (Across All Sites)

                              Standard     95% Confidence Interval
Uniform      N      Mean      Error        Lower Limit     Upper Limit

   1        116    0.8190     0.0761         0.6683    to    0.9696
   4        116    4.3966     0.1266         4.1458    to    4.6473
   5        116    4.7845     0.1340         4.5190    to    5.0500
   6        116    2.5345     0.1725         2.1928    to    2.8761
   8        116    4.5000     0.1197         4.2630    to    4.7370
   9        116    0.9397     0.0902         0.7610    to    1.1184
  10        116    3.9655     0.1278         3.7124    to    4.2187
  11        116    3.6466     0.1878         3.2745    to    4.0186
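
The limits in Table 3 are consistent with the usual mean plus or minus t times standard error. The sketch below recomputes them; the critical value 1.98 (two-sided 95% t value for 115 degrees of freedom) is an assumption, since the paper does not state the multiplier it used, and the names are illustrative.

```python
# Assumed two-sided 95% t critical value for N - 1 = 115 degrees of freedom.
T_CRIT = 1.98

# (mean, standard error) for each uniform, copied from Table 3.
TABLE_3 = {
    1: (0.8190, 0.0761),
    4: (4.3966, 0.1266),
    5: (4.7845, 0.1340),
    6: (2.5345, 0.1725),
    8: (4.5000, 0.1197),
    9: (0.9397, 0.0902),
    10: (3.9655, 0.1278),
    11: (3.6466, 0.1878),
}

def confidence_interval(mean, std_err, t_crit=T_CRIT):
    """Return (lower, upper) limits of mean +/- t_crit * std_err."""
    half_width = t_crit * std_err
    return mean - half_width, mean + half_width

# Recomputed limits for every uniform in the table.
intervals = {u: confidence_interval(m, se) for u, (m, se) in TABLE_3.items()}
```

With this multiplier the recomputed limits agree with the tabulated ones to about three decimal places.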

[Plot not reproduced. Vertical axis: mean preference, 0.0 (LOW) to 5.0 (HIGH).
Horizontal axis: camouflage uniform (1, 4, 5, 6, 8, 9, 10, 11). Plotted values are
the means and 95-percent confidence limits of Table 3.]


Figure  1 

Desert  Camouflage  Uniform  Ability  to  Blend  with  the  Desert  Background, 
Means,  and  95-Percent  Confidence  Intervals  (Across  All  Sites) 

Table 4

Analysis of Variance for the Ability of the Camouflage
Uniforms to Blend with the Desert Background (Across All Sites)

              Degrees of     Sum of                                   Significance
Source        Freedom        Squares       Mean Square     F-Test     Level

Uniforms          7          2046.1379      292.3054      140.4009    0.0000*
Error           920          1915.3793        2.0819
Total           927          3961.5172

Bartlett's Test for Homogeneous Variance
Number Degrees of Freedom = 7
F = 19.23     Significance Level = 0.000*

*Significant at less than the 0.001 level
Table 4 indicates that there are significant differences in the ability of the camouflage
uniforms to blend with the desert background. The Bartlett's Test indicates that the variances
for the uniforms are not homogeneous, i.e., they are significantly different, so the uniforms
are not necessarily from the same population.
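
The mean squares and F ratio in Table 4 follow directly from the sums of squares and degrees of freedom. The short check below reproduces that arithmetic; the function name is illustrative and the inputs are the tabulated values.

```python
def anova_f(ss_between, df_between, ss_error, df_error):
    """Return (between mean square, error mean square, F ratio)."""
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    return ms_between, ms_error, ms_between / ms_error

# Eight uniforms with 116 ratings each give 928 observations in total.
n_uniforms, n_per_uniform = 8, 116
df_between = n_uniforms - 1                  # 7
df_total = n_uniforms * n_per_uniform - 1    # 927
df_error = df_total - df_between             # 920

ms_b, ms_e, f_ratio = anova_f(2046.1379, df_between, 1915.3793, df_error)
```

The degrees of freedom also confirm that the analysis pools 116 ratings for each of the eight uniforms.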

Table 5

Duncan's Multiple-Range Test
for All Sites Combined, Daylight

Group         Uniform         Mean

BEST
  1           Uniform 4       4.3966
              Uniform 8       4.5000
              Uniform 5       4.7845

  2           Uniform 11      3.6466
              Uniform 10      3.9655

  3           Uniform 6       2.5345

WORST
  4           Uniform 1       0.8190
              Uniform 9       0.9397

4.0  SECTION  4  -  DISCUSSION 


A review of the data for sites 1-10, and for all sites combined, shows that camouflage
uniforms 4, 5, and 8 were the most effective in blending with the desert terrain. These
uniforms had mean blending values of 4.3966, 4.7845, and 4.5000, respectively (Tables 3 and
5). With the exception of site 5 (Table 2), where camouflage uniforms 6 and 10 were judged
as best blending with the desert background, at least one of uniforms 4, 5, and 8 was
among those that blended best at every site. The overall mean blending values
for these three uniforms do not differ significantly from each other (Table 5 and Figure 1).
Additional review of the data indicates that the standard camouflage uniform (#1) and
uniform 9 had the worst blend with the desert background, when averaged across all sites.

The  data  for  this  study  appears  fairly  clean;  however,  one  large  and  pressing  caveat 
must  be  taken  into  consideration,  before  any  final  decision  on  desert  uniforms  is  made.  The 
uniform  tests  conducted  so  far  have  been  in  the  U.S.  desert  Southwest.  Any  future  conflicts 
in  which  a  desert  camouflage  uniform  will  be  used  by  U.S.  forces  will,  in  all  probability,  be 
in  the  Middle  East.  These  deserts  tend  to  be  lighter  and  more  tan  than  the  grayer  desert  of  the 
United  States.  They  also  have  much  less  vegetation.  The  best  camouflage  uniforms  from  this 
study  should  be  evaluated  in  the  areas  of  interest  in  the  Middle  East  for  final  determination 
as  to  color  blend  with  the  background.  The  resulting  data  may  necessitate  color  modifications 
of  the  uniforms  to  ensure  that  the  best  possible  blend  with  the  deserts  of  interest  is  achieved. 

5.0  SECTION  5  -  SUMMARY  AND  CONCLUSIONS 

A total of eight camouflage uniforms were evaluated as to their ability to blend with
desert backgrounds in the U.S. desert Southwest. Ten sites were used. The uniforms were
viewed in all possible pairs (28), and the one that blended best with the background was
selected from each pair. The results of this evaluation produced the following conclusions:

a.  Camouflage  uniforms  4,  5,  and  8  blended  best  with  the  U.S.  desert  backgrounds. 

b.  Standard  camouflage  uniform  1  and  prototype  uniform  9  were  the  least  effective 
in  blending  with  the  U.S.  desert  backgrounds. 

c.  An  additional  desert  camouflage  evaluation  should  be  conducted  in  the  Middle  East, 
to  ensure  that  the  best  uniform  is  selected  for  the  U.S.  military. 
1. Letter Report, STRBE-JDS to U.S. Army Natick RD&E Center, "Development of an
Effective Desert Camouflage Pattern/Color for Uniforms," 11 July 86.

2.  Natrella,  Mary  G.,  Experimental  Statistics.  National  Bureau  of  Standards  Handbook  91, 
U.S.  Department  of  Commerce,  Washington,  D.C.,  1966. 
HAS VARIABILITY BEEN REDUCED?


Gary  Aasheim 

U.S.  Army  Armament,  Munitions  and  Chemical  Command 
Product Assurance and Test Directorate
Tool and Equipment/Aircraft Armament Branch
Rock Island, Illinois 61299-6000


Often changes are made in measuring methods and in production methods
with, at best, only checks to determine whether or not the changes affected
variability. After a change is made, a natural question is: did the
change affect measurement precision or product uniformity?

I am not aware of an established method for analyzing before and
after sample results to answer that question for all situations. Of
course, if the before-change and after-change samples are each from a
single population, the standard F-test can be used.

But sometimes the before-change samples are from one set of
populations and the after-change samples are from a different set of
populations.

One method for dealing with this situation is to compare the pooled
before-change variance with the pooled after-change variance using an
F-test. However, if one or both sets of populations are heteroscedastic,
this method seems to be of marginal soundness. What are some possible
approaches for dealing with this latter situation?
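
The pooled-variance comparison mentioned above can be sketched as follows. This is only an illustration of that one method under its equal-variance assumption, not an answer to the heteroscedastic case; the function names and the toy data are hypothetical, and the resulting ratio would still have to be referred to an F table with the stated degrees of freedom.

```python
import statistics

def pooled_variance(samples):
    """Pool sample variances, weighting each by its degrees of freedom."""
    df = sum(len(s) - 1 for s in samples)
    weighted = sum((len(s) - 1) * statistics.variance(s) for s in samples)
    return weighted / df, df

def pooled_f_ratio(before, after):
    """Return (F ratio, numerator df, denominator df) for before vs. after."""
    var_before, df_before = pooled_variance(before)
    var_after, df_after = pooled_variance(after)
    return var_before / var_after, df_before, df_after

# Hypothetical readings: two populations sampled before and two after a change.
before = [[10, 12, 14], [20, 21, 22]]
after = [[5, 6, 7], [9, 10, 11]]
f_ratio, df_num, df_den = pooled_f_ratio(before, after)
```

Pooling within each set removes differences in the population means, which is why the method can be applied when the samples come from several populations.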



WHICH  DISTRIBUTION  APPLIES? 


Gary  Aasheim 

U.S.  Army  Armament,  Munitions  and  Chemical  Command 
Product  Assurance  and  Test  Directorate 
Tool and Equipment/Aircraft Armament Branch
Rock  Island,  Illinois  61299-6000 


1.  Faced  with  the  questions  -  do  the  sample  measurements  support  the 
customer’s  belief  that  a  given  dimensional  requirement  was  not  met  to  the 
degree  required  by  the  contract,  and,  if  not,  what  dimensional 
requirements  could  be  met  to  the  required  degree?  -  a  co-worker  of  mine 
took  the  60  sets  of  20  readings  (see  below)  and  checked  for  normality  by: 

a.  transforming  the  readings  in  each  set  by  dividing  each  difference, 
reading  minus  set  sample  average,  by  the  set  sample  standard  deviation. 

b.  treating  the  1200  transformed  readings  as  a  single  sample  of  1200. 

c.  finding  the  average,  standard  deviation,  skewness  and  kurtosis  of  the 
transformed  readings,  plus  the  standard  deviations  of  the  latter  two 
statistics  based  upon  the  assumption  that  the  1200  readings  were  from  a 
normally  distributed  population. 

d. breaking the transformed readings by size into 26 groups and running
a chi-square goodness-of-fit test where the expected values were based
upon the normal distribution.
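
Steps a through c can be sketched as below. The standard errors sqrt(6/n) and sqrt(24/n) are the usual large-sample approximations for skewness and excess kurtosis under normality and are an assumption here, since the text does not give the formulas used; the function names and the simulated readings are also illustrative.

```python
import math
import random
import statistics

def standardize_and_pool(sets):
    """Steps a and b: standardize each set by its own mean and SD, then pool."""
    pooled = []
    for readings in sets:
        mean = statistics.fmean(readings)
        sd = statistics.stdev(readings)
        pooled.extend((x - mean) / sd for x in readings)
    return pooled

def shape_statistics(z):
    """Step c: mean, SD, skewness, excess kurtosis, and approximate SEs."""
    n = len(z)
    mean = statistics.fmean(z)
    m2 = sum((v - mean) ** 2 for v in z) / n
    m3 = sum((v - mean) ** 3 for v in z) / n
    m4 = sum((v - mean) ** 4 for v in z) / n
    return {
        "mean": mean,
        "sd": statistics.stdev(z),
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
        "se_skewness": math.sqrt(6.0 / n),    # large-sample approximation
        "se_kurtosis": math.sqrt(24.0 / n),   # large-sample approximation
    }

# Simulated stand-in for the 60 sets of 20 readings (not the real data).
rng = random.Random(0)
sets = [[rng.gauss(5.0 + i, 1.0 + 0.1 * i) for _ in range(20)] for i in range(60)]
z = standardize_and_pool(sets)
summary = shape_statistics(z)
```

Comparing the skewness and excess kurtosis with a small multiple of their standard errors gives a quick indication of departure from normality before the chi-square test of step d.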

2.  Two  considerations  drove  the  transforming  and  pooling  efforts  above. 
First,  running  60  tests  for  normality  would  have  taken  more  time  and  work 
than the approach taken. Second, when my co-worker gained an initial
acquaintance  with  the  data  by  computing  sample  averages  and  standard 
deviations  and  by  counting  readings  outside  the  dimensional  requirements, 
he  did  not  spot  any  obviously  atypical  readings  and,  so,  felt  that  an 
assumption  of  a  single  underlying  statistical  distribution  with  different 
parameters  for  different  populations  was  reasonable. 

3.  Is  there  a  better  approach  than  that  used  by  my  co-worker? 


STATISTICALLY BASED MATERIAL PROPERTIES


Donald  M.  Neal  and  Mark  G.  Vangel 
U.S. Army Materials Technology Laboratory, SLCMT-MRS
Watertown,  Massachusetts  02172-0001 


ABSTRACT 


This paper describes statistical procedures and their importance
in obtaining composite material property values for designing structures
for aircraft and military combat systems. The property value is
such that the strength exceeds this value with a prescribed probability,
with 95% confidence in the assertion. The prescribed survival probabilities
are 0.99 and 0.90 for the A- and B-basis
values, respectively. The basis values for strain-to-failure measurements
are defined in a similar manner. The B value is the primary
concern of this paper.


INTRODUCTION 


Many traditional structural materials, which are homogeneous and
isotropic, differ from composite materials, which have extensive
intrinsic statistical variability in many material properties. This
variability, particularly important to strength properties, is due not
only to inhomogeneity and anisotropy, but also to the basic brittleness
of many matrices and most fibers and to the potential for property
mismatch between the components. Because of this inherent statistical
variability, careful statistical analysis of composite material
properties is not only more important but is also more complex
than for traditional structures.

This paper addresses this issue by discussing the methodologies
and their sequence of applications for obtaining statistical material
property values (basis values). A more detailed analysis showing the
various operations required for computation of the basis value is
presented by the authors in the statistics chapter of the MIL-17 Handbook
(ref. 1). The procedures in this handbook required substantial
research efforts in order to accommodate various requirements (e.g.,
small samples, batch-to-batch variability, and tolerance limits) for
obtaining the basis values. Guidance in selection of the methodology
came from the needs of the military, aircraft industry, and the Federal
Aviation Administration (FAA). Some of the procedures include
determination of outliers, selection of statistical models, tests for
batch-to-batch variation, single and multi-batch models for basis
value computation, and nonparametric methods. In figure 1, a flowchart
is shown outlining the sequence of operations.


An important application of the basis property value is to the
design of composite aircraft structures where a design allowable is
developed from this value. The process usually involves a reduction
in the basis values in order to represent a specific application of
the composite material in a structure (for example, a structure with a
bolt hole for a particular test and environmental condition). One
common approach in the design process requires the design allowable be
divided by the maximum applied stress or strain and the result to be
greater than one. The basis value is also used in qualifying new
composite material systems to be used in the manufacture of aircraft.
In this case, the values are obtained from an extensive test matrix
including both loading and environmental conditions. The value also
provides guidance in selecting material systems for specific design
requirements.

The paper also shows how material strength variability and the
number of test specimens can affect the determination of reliability
numbers. Methods are presented for obtaining protection against this
situation by providing a tolerance limit value on a stress corresponding
to a high reliability. A comparison between deterministic and
statistical reliability estimates demonstrates the inadequacy of the
deterministic approach. A case study is presented describing the
recommended procedures outlined in the MIL-17 Handbook for determining
statistically based material property values.


RELIABILITY  ESTIMATES 


Sample  Size  -  Variability 

The  importance  of  determining  a  tolerance  limit  on  a  percentile 
value  is  graphically  displayed  in  figures  2  and  3.  The  cumulative 
distribution  function  (CDF)  of  the  standard  normal  (mean  equals  0, 
standard  deviation  1)  is  plotted  for  sample  sizes  of  10  and  50,  using 
25  randomly  selected  sets  of  data.  In  figure  2,  for  n  equals  10,  the 
spread  in  the  percentile  is  2.1  for  the  10th  percentile.  In  figure  3, 
for n equals 50, the spread is 0.7 for the same percentile. The
results show the relative uncertainty associated with small sample
sizes when computing reliability values. The range in the percentile
can also depend on the amount of variability in the data (i.e., the
variance).
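
The effect of sample size on the spread of an estimated 10th percentile can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reproduction of figures 2 and 3: each sample's percentile is estimated from a fitted normal (mean plus z times SD), and the function name, seed, and defaults are illustrative.

```python
import math
import random
import statistics

# Standard normal 10th percentile, about -1.2816.
Z_10 = statistics.NormalDist().inv_cdf(0.10)

def percentile_spread(n, n_sets=25, seed=0):
    """Range of estimated 10th percentiles over n_sets standard normal samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sets):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
        estimates.append(mean + Z_10 * sd)   # normal-theory percentile estimate
    return max(estimates) - min(estimates)

if __name__ == "__main__":
    for n in (10, 50):
        print(n, round(percentile_spread(n), 2))
```

The exact spreads depend on the random draws, but the n = 10 samples should scatter noticeably more than the n = 50 samples, in line with the figures described above.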

Often in structural design, a design allowable value is obtained
from the basis value. A design allowable is an experimentally determined
acceptable stress value for a material (called an allowable
stress). The allowable is a function of the material basis value,
layup, damage tolerance, open holes, and other factors. It is usually
numerically determined for some critical stress region located within
the structure. In using the allowable it is required that the critical
stress be less than a proportion (margin of safety) of the allowable
stress value. Determining a property value from only 10 strength
tests using 90% reliability estimates without confidence in the assertion
could result in a nonconservative design situation. In order to
prevent this occurrence and provide a guarantee of the reliability
value, a tolerance limit (i.e., a lower confidence bound) on the percentile
is recommended. The MIL-17 Handbook statistics chapter
describes methods for obtaining basis values for a prescribed tolerance
limit.


Definition  of  the  B-Basis  Value 

The B-basis value is a random variable where an observed basis
value from a sample (data set) will be less than the 10th percentile
of the population with a probability of .95. In figures 4 and 5 a
graphical display is shown of the basis value probability density
functions for random samples of n equals 10 and 50, respectively.
Samples are from the same population as in figures 2 and 3. The
vertical dotted lines represent the location of the population 10th
percentile ($X_{.10}$). The probability density function of the population
is also displayed in the figures. Note that 95% of the time the basis
value is less than $X_{.10}$. The graphical display of the basis value
density function shows much less dispersion for n equals 50 than for n
equals 10; therefore, small sample sizes often result in very conservative
estimates of the basis value.


STATISTICAL  METHODS  -  MATERIAL  PROPERTY  VALUES 


Flowchart  Guidelines 

Since the statistical procedures and the flowchart (figure 1)
have been published in the MIL-17 Handbook (ref. 1) and in ref. 2, this
paper will only present a brief description of the methods, their
purpose, interpretation of results, and the need for following the
order of application suggested by the flowchart. The authors have
written a computer code which performs the necessary computations for
obtaining the basis values as described in the flowchart. The code is
available on a diskette, which can be used on various computers
including PCs that are IBM compatible. Both the executable and
source code are on the diskette. This code is available free of
charge from the authors. The flowchart capability was tested by
applying the recommended procedures using both real and simulated data
sets. The results of the simulations showed that at least 95% of computed
values were less than the known 10% point. This is consistent with the
definition of the 'B'-basis value; see also refs. 1 and 2.

The flowchart has two directions of operations: one is for the
single batch (sample), and the other is for the multi-batch case. A
batch  could  represent  specimens  made  from  a  manufactured  sheet  of 
composite  material  representing  a  roll  of  prepreg  material.  Published 
MIL-17  Handbook  basis  values  are  usually  obtained  from  five  batches  of 
six  specimens  each. 

Initially, let us assume the user of the flowchart has only a
single batch, or more than one batch where the batches can be pooled
so that a single-sample analysis can be applied. The first operation
(see figure 1) is to determine if outliers exist in the data set. A
more detailed discussion of outlier detection schemes and applications
is published in ref. 3. The method selected is called the Maximum
Normed Residual (MNR) procedure (ref. 4) and is published in the
MIL-17 Handbook. It is simple to apply and performs reasonably well
even though it assumes that the data is from a symmetric distribution.
The analysis requires obtaining an ordered array of normed residuals
written as

$$ NR_i = (x_i - \bar{x})/s, \qquad i = 1, 2, \ldots, n \tag{1} $$

where $\bar{x}$ is the mean, $s$ is the standard deviation (SD), and $n$ is the
sample size. If the maximum absolute value of $NR_i$ (MNR) is less than
some critical value (CV) (see refs. 1 and 2), then no outliers exist.
If MNR is greater than CV, then the outlier is the value $x_i$ with the
largest $|NR_i|$.

Outlying test results are substantially different from the primary
data. For example, assume that the data set contains 16 strength
values and 15 range from 150 to 200 KSI while the other is 80 KSI.
The MNR method would identify the 80 KSI value to be an outlier. The
80 KSI specimen should be examined for problems in fabrication and
testing. If a rationale is determined for rejecting this test result,
then do not include the outlying test value in the data set when
obtaining the basis value. If there is no rationale for rejection,
the outlier should remain unless the test engineer believes that a
non-detectable error exists.
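
The MNR screen of equation (1) can be sketched for the 16-value example above. The 15 values between 150 and 200 KSI are invented for illustration, and the critical value passed in is a placeholder; in practice it would be taken from the tables in refs. 1 and 2 for the sample size and significance level in use.

```python
import statistics

def max_normed_residual(data):
    """Return (MNR, value with the largest absolute normed residual)."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    residuals = [(x - mean) / sd for x in data]
    worst = max(range(len(data)), key=lambda i: abs(residuals[i]))
    return abs(residuals[worst]), data[worst]

def mnr_outlier(data, critical_value):
    """Return the flagged value if MNR exceeds critical_value, else None."""
    mnr, candidate = max_normed_residual(data)
    return candidate if mnr > critical_value else None

# Hypothetical strengths in KSI: 15 values from 150 to 200 plus one at 80.
strengths = [150, 154, 158, 161, 165, 168, 172, 175,
             179, 182, 186, 189, 193, 196, 200, 80]
mnr, candidate = max_normed_residual(strengths)
flagged = mnr_outlier(strengths, critical_value=2.6)  # placeholder CV, not from the tables
```

A flagged value is only a candidate for removal; as the text notes, it should stay in the data set unless a physical reason for rejection is found.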

It is important to identify the existence of outliers but also of
equal importance to resist removing the values unless a rationale has
been established. Leaving in or arbitrarily removing outlying values
can adversely affect the statistical model selection process and
consequently the basis value computation. An outlier in a data set
will usually result in a larger variance and a possible shift in the
mean when compared with the same data without the outlier. The amount
of shift and the variance increase depend on the severity of the
outlier (distance removed from the primary data set). It is suggested
that for small samples (n is less than 20) critical values corresponding
to a 10% significance level be used (see refs. 1 and 2) in order
to identify outlying values. If the sample is greater than 20, then
use the 5% level. It is often difficult to test for outliers when
there is a limited amount of data; therefore, the 10% level will
provide additional power to detect outliers. This level will also
result in more chance of incorrectly identifying outliers. Outliers
can be incorrectly identified from data sets with highly skewed distributions;
therefore, it is suggested the box-plot method (refs. 1
and 3) be applied for determining outliers in this situation.
Goodness of Fit Test - Distribution Function


Referring to figure 1, the next step is to identify an acceptable
model for representing the data. In order of preference, the three
candidate models are Weibull, normal, and the nonparametric method.
The Weibull model,

$$ F_W(x) = 1 - \exp[-(x/\alpha)^{\beta}] \tag{2} $$

where $x$ is greater than 0, $\alpha$ is the scale parameter, and $\beta$ is the shape
parameter, is considered first in the ordering of the test procedures.
The Anderson-Darling (AD) goodness-of-fit test statistic (refs. 1 and
5) is suggested for identifying the model because it emphasizes
discrepancies in the tail regions between the cumulative distribution
function of the data and the cumulative distribution function of the
model. This is more desirable than evaluating the distributional
assumptions near the mean since reliability estimates are usually
measured in the tail regions. The Anderson-Darling test statistic and
the observed significance level computations are described in refs. 1
and 2. Example problems are also shown in ref. 1, demonstrating
computational procedures for applying the AD method.
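
The tail weighting of the Anderson-Darling statistic is easiest to see in its standard computing formula. The sketch below evaluates that formula for a fully specified Weibull model of the form (2); it is not the handbook procedure, which estimates the parameters from the data and converts the statistic to an observed significance level using refs. 1 and 2. The function names and the test data are illustrative.

```python
import math

def weibull_cdf(x, scale, shape):
    """Weibull CDF of equation (2): 1 - exp[-(x/scale)**shape]."""
    return 1.0 - math.exp(-((x / scale) ** shape))

def anderson_darling(data, cdf):
    """A^2 = -n - (1/n) * sum (2i-1) [ln F(x_(i)) + ln(1 - F(x_(n+1-i)))]."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        f_low = cdf(x)              # F at the i-th smallest value
        f_high = cdf(xs[n - i])     # F at the i-th largest value
        total += (2 * i - 1) * (math.log(f_low) + math.log(1.0 - f_high))
    return -n - total / n

# Illustrative data: evenly spaced quantiles of a Weibull with scale 100, shape 10.
n = 20
data = [100.0 * (-math.log(1.0 - (i - 0.5) / n)) ** (1.0 / 10.0)
        for i in range(1, n + 1)]
a2_matching = anderson_darling(data, lambda x: weibull_cdf(x, 100.0, 10.0))
a2_wrong = anderson_darling(data, lambda x: weibull_cdf(x, 100.0, 2.0))
```

The logarithms grow rapidly as F approaches 0 or 1, which is how the statistic gives extra weight to disagreement in the tails; the mismatched shape parameter yields a much larger value than the matching one.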

In following the flowchart, if the Weibull model is not
accepted as a desired model, then a test for the normal distribution
is suggested,

$$ F_N(x) = \frac{1}{\sigma (2\pi)^{1/2}} \int_{-\infty}^{x} \exp[-(t-\mu)^2 / 2\sigma^2] \, dt \tag{3} $$

where $\mu$ is the mean and $\sigma^2$ is the variance. The AD test for the
normal model is similar to the test for the Weibull. The procedure
used to identify the normal model is also in refs. 1 and 2. It should
be noted that for small samples reliable identification of a model to
represent the data is difficult unless some prior information about the
population is known.

If the Weibull and normal models are rejected, then a nonparametric
method can be used to compute the basis value (see flowchart).
This method does not assume any parametric distribution as described
above. Therefore, model identification is not required, although
application of the method can often result in overly conservative
estimates for the basis value.


The conventional nonparametric method (ref. 6) requires a minimum
of 29 values in order to obtain a 'B'-basis value, and 300 are needed
for the 'A'-basis number. This paper presents a method for obtaining
'A' and 'B' basis values for any sample size. The method is a modification
of the ref. 7 procedure involving the ordered data values
arranged from least to largest with the basis value defined as

$$ B = x_{(r)} - K [x_{(r)} - x_{(1)}] \tag{4} $$

where $x_{(r)}$ is the $r$th ordered value and $x_{(1)}$ is the first ordered number.
In refs. 1 and 2, r and K values are tabulated for sample
sizes n. Note, in the case where 'A' values are required for small
sample sizes, it is suggested that nonparametric methods be applied
unless some prior information about the model is known. This is because
of the limited information available in the lower tail region of the
distribution, which can result in erroneous estimates of the reliability
numbers. The 'A'-basis value is often used in design where a
single load path exists; therefore, it is essential that the value be
conservative.
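
The sample-size requirement of the conventional method and the form of equation (4) can both be checked numerically. The first function rests on the standard argument that the smallest of n observations falls below the relevant percentile with probability 1 minus (survival probability) to the power n; it returns 29 for the 'B' case and 299 for the 'A' case, which the text rounds to 300. The second function simply evaluates (4); the r and K used in the example are hypothetical, not values from the tables in refs. 1 and 2.

```python
import math

def min_sample_size(survival, confidence=0.95):
    """Smallest n with survival**n <= 1 - confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(survival))

def order_statistic_basis(data, r, k):
    """Equation (4): B = x_(r) - K * (x_(r) - x_(1)), data sorted ascending."""
    xs = sorted(data)
    x_r, x_1 = xs[r - 1], xs[0]
    return x_r - k * (x_r - x_1)

n_b = min_sample_size(0.90)   # 'B'-basis: 90% survival
n_a = min_sample_size(0.99)   # 'A'-basis: 99% survival

# Hypothetical r and K, chosen only to show the arithmetic.
example = order_statistic_basis([15.0, 10.0, 12.0], r=3, k=1.5)
```

Because K multiplies the gap between the r-th and the smallest value, a K greater than one places the basis value below the smallest observation, which is what allows the method to work for small samples.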


Weibull Method - 'B'-Basis Value

Returning to the sequence of operations as outlined in the flowchart,
if the Weibull model is accepted, then determine the basis
value from the following relationship

$$ B = \hat{\alpha} [\ln(1/P_B)]^{1/\hat{\beta}} \tag{5} $$

where $\hat{\beta}$ and $\hat{\alpha}$ are maximum likelihood estimates of the shape $\beta$ and
scale $\alpha$ of the Weibull distribution. That is, these estimates maximize
the likelihood function, which is the product of the probability
densities of the model (2) evaluated at each of the n data values. Tables for $P_B$
as a function of the sample size n and the code for determining $\hat{\alpha}$ and $\hat{\beta}$
are given in refs. 2 and 3.
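
A minimal sketch of the computation behind equation (5) follows. It is not the authors' code: the maximum likelihood equations are solved here by bisection on the shape parameter, the data are rescaled by their maximum to avoid overflow, and $P_B$ is treated as an input that would come from the tables in the references. All names and the simulated data are illustrative.

```python
import math
import random

def weibull_mle(data):
    """Return (scale, shape) maximum likelihood estimates for positive data."""
    n = len(data)
    top = max(data)
    if top == min(data):
        raise ValueError("data must not all be equal")
    logs = [math.log(x / top) for x in data]   # all <= 0 after rescaling
    mean_log = sum(logs) / n

    def score(shape):
        # Profile likelihood equation for the shape; increasing in shape.
        weights = [math.exp(shape * v) for v in logs]
        weighted = sum(w * v for w, v in zip(weights, logs)) / sum(weights)
        return weighted - 1.0 / shape - mean_log

    lo, hi = 1e-6, 1.0
    while score(hi) < 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = top * (sum(math.exp(shape * v) for v in logs) / n) ** (1.0 / shape)
    return scale, shape

def weibull_basis(scale, shape, p_b):
    """Equation (5): B = scale * [ln(1/P_B)]**(1/shape)."""
    return scale * math.log(1.0 / p_b) ** (1.0 / shape)

def simulate_weibull(n, scale, shape, seed=0):
    """Draw n Weibull values (illustrative data only)."""
    rng = random.Random(seed)
    return [rng.weibullvariate(scale, shape) for _ in range(n)]
```

Setting $P_B$ to 0.90 in (5) reproduces the fitted 10th percentile itself, so the confidence requirement must enter through tabulated $P_B$ values that differ from 0.90 and depend on n.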


Normal Method - 'B'-Basis

If the Weibull model was rejected and the normal model is an
acceptable representation of the data, then compute the basis value as

$$ B = \bar{X} - K_B S \tag{6} $$

where $\bar{X}$ and $S$ are the mean and SD, and $K_B$ is obtained from tables in
refs. 1 and 2.
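
The role of $K_B$ in equation (6) can be illustrated by simulation. By the definition of the B-basis value, $K_B$ is the factor for which $\bar{X} - K_B S$ falls below the population 10th percentile 95% of the time, so it can be approximated as the 95th percentile of $(\bar{X} - z_{.10})/S$ over repeated standard normal samples. This Monte Carlo sketch is an assumption-laden stand-in for the tables in refs. 1 and 2; the names, seed, and replication count are illustrative.

```python
import math
import random
import statistics

# Standard normal 10th percentile, about -1.2816.
Z_10 = statistics.NormalDist().inv_cdf(0.10)

def simulate_k_b(n, reps=20000, confidence=0.95, seed=0):
    """Monte Carlo estimate of the one-sided tolerance factor K_B."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
        ratios.append((mean - Z_10) / sd)
    ratios.sort()
    return ratios[int(confidence * reps)]

def normal_basis(data, k_b):
    """Equation (6): B = mean - K_B * SD."""
    return statistics.fmean(data) - k_b * statistics.stdev(data)
```

For n = 10 the simulated factor should land close to the tabulated one-sided tolerance factor of roughly 2.36, and it shrinks as n grows, which is the sense in which small samples give more conservative basis values.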


PROCEDURES  FOR  MULTIPLE  BATCHES 


Anderson-Darling  Test 

If more than one batch of data is being analyzed, then a
significance test is required in order to determine if the batches may
be pooled or if a multi-batch statistical analysis is to be applied
(see flowchart). Note, the outlier test is to be applied to pooled
data prior to testing. The recommended test is the K-Sample Anderson-Darling
Test (refs. 1 and 8), which determines if batch-to-batch variability
exists among the K batches. This test is similar to the AD
test for identifying acceptable statistical models for representing
data. In the K-sample case, paired comparisons are made for the
empirical CDFs, while the other AD methods compare a parametric CDF
with an empirical CDF. In all cases, this comparison involves the
integration of the squared difference of the CDFs, weighted in the
tail region of the distribution. The K-sample AD is basically a two-sample
test in that each sample (ith batch) is individually compared
with the pooled K-1 other batches, repeated K times until each ith
batch has been compared. The average of these K two-sample tests
determines the K-sample AD test statistic. Tables of critical values
and a detailed description of the method and its application are given
in refs. 1, 2, and 8.

If a significant difference is noted among the K batches, then,
as shown in the flowchart, a test for equality of variance is suggested
using a method in ref. 9. Application of the method, tables,
and the necessary relationships for computing the test statistic are
given in refs. 1 and 2. The variance test is suggested only as a
diagnostic tool. Sample test results that have large variances relative
to the other batches may identify possible problems in testing or
manufacturing of the specimens. Equality of variance is not required
when applying the Modified Lemon method, as discussed below, in the
multi-batch case. Although the Modified Lemon method is based on the
assumptions of equality of variance and normality, simulation results
have shown that these assumptions are not necessary. After testing
for equality of variance, it is suggested that the basis value be
obtained from application of the Modified Lemon method (see figure 1).


The  Modified  Lemon  Method 

Composite  materials  typically  exhibit  considerable  variability  in 
strength  from  batch  to  batch.  Because  of  this  variability,  one  should 
not  indiscriminately  pool  data  across  batches  and  apply  single  batch 
procedures. The K-sample Anderson-Darling test was introduced into
the MIL-17 Handbook in order to prevent the pooling of data in situations
where significant variability exists between batches. For the
situation where the K-sample Anderson-Darling test indicates that
batches should remain distinct, a special basis value procedure has
been provided. This method, referred to as the 'ANOVA' or 'Modified
Lemon' method, will be discussed next. A detailed description for
applying  the  method  is  shown  in  refs.  1  and  2.  For  a  discussion  of 
the  underlying  theory,  see  ref.  10,  the  original  Lemon  paper,  and  ref. 
11,  the  Mee  and  Owen  paper  which  modifies  the  Lemon  method. 

The Modified Lemon method considers each strength measurement to
be a sum of three parts. The first part is an unknown constant mean.
If one were to produce batches endlessly, breaking specimens from each
batch, the average of all of these measurements would approach this
unknown constant in the limit of infinitely many batches. Imagine,
however, that one were to test many specimens from a single batch.
The average strength approaches a constant in this situation as well,
but this constant will not be the same as for the case where each
specimen came from a different batch. The average converges to an
overall population mean (a 'grand mean') in the first case, while the
average converges to the population mean for a particular batch in the
second case. The difference between the overall population mean and
the population mean for a particular batch is the second component of
a strength measurement. This difference is a random quantity - it
will vary from batch to batch in an unsystematic way. We assume that
this random variable has a normal distribution with a mean of zero and
some unknown variance which we refer to as the 'between batch' component
of variance. Finally, in order to arrive at the value of a particular
strength measurement, we must add a third component to the sum of the
constant overall mean and the random shift due to the present batch.
This is another random component which differs for each specimen in
each batch. It represents variability about the batch mean. It also
is assumed to have a normal distribution with a mean of zero and an
unknown variance, which is referred to as the 'within batch' component
of variance.
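
The three-part model described above can be illustrated with a short simulation. This is only a sketch: the numerical values of the grand mean and the two standard deviations, the seed, and the function name are hypothetical and are not taken from the handbook or from the data in this paper.

```python
import random
import statistics

def simulate_strength(mu, sigma_between, sigma_within, batch_shift=None, rng=random):
    """Return one strength value: grand mean + batch shift + specimen deviation."""
    if batch_shift is None:
        # A new batch: draw its shift from N(0, sigma_between^2).
        batch_shift = rng.gauss(0.0, sigma_between)
    # Specimen-to-specimen scatter about the batch mean.
    return mu + batch_shift + rng.gauss(0.0, sigma_within)

rng = random.Random(1989)
MU, SIGMA_B, SIGMA_W = 100.0, 5.0, 2.0  # hypothetical values

# Case 1: every specimen comes from a different batch.
many_batches = [simulate_strength(MU, SIGMA_B, SIGMA_W, rng=rng)
                for _ in range(20000)]
grand_average = statistics.fmean(many_batches)

# Case 2: many specimens from one fixed batch.
one_shift = rng.gauss(0.0, SIGMA_B)
one_batch = [simulate_strength(MU, SIGMA_B, SIGMA_W, batch_shift=one_shift, rng=rng)
             for _ in range(20000)]
batch_average = statistics.fmean(one_batch)

print("average over many batches:", round(grand_average, 2))
print("average within one batch: ", round(batch_average, 2))
print("mean of that batch:       ", round(MU + one_shift, 2))
```

In the first case the average settles near the grand mean; in the second it settles near the grand mean plus the shift of the one batch that was sampled.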

The 'Modified Lemon' method uses the data from several batches to
determine a material basis property value which provides 95% confidence
on the appropriate percentile of a randomly chosen observation
from a randomly chosen future batch. This basis property provides
protection against the possibility of batch-to-batch variability
resulting in future batches which have lower mean strength than those
batches for which data are available.

To see what this means, imagine that several batches have been
tested and that this statistical procedure has been applied to provide
a 'B'-basis value. Now, imagine that you were to get another batch
and test a specimen from it. After this you obtained still another
batch and tested a specimen from it. If you were to repeat this
process for infinitely many future batches, you would obtain a distribution
of strength measurements corresponding to a randomly chosen
measurement from a random batch. You can be 95% certain that the
basis value which you calculated originally is less than the tenth
percentile of this hypothetical population of future measurements.
This is the primary reason why the Modified Lemon method is advocated
by the MIL-17 Handbook - it provides protection against variability
between batches which will be made in the future through the use of
data which are presently available.
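
The "hypothetical population of future measurements" can be made concrete under the normal model described earlier. The sketch below assumes that the grand mean and both variance components are known, which is never the case in practice, and uses hypothetical values for them. It computes the tenth percentile that a 'B'-basis value is meant to fall below, and checks by simulation that about ten percent of future measurements fall under it.

```python
import random
from statistics import NormalDist

MU, SIGMA_B, SIGMA_W = 100.0, 5.0, 2.0  # hypothetical values

# One specimen from one random future batch has variance
# sigma_between^2 + sigma_within^2 under the normal model.
total_sd = (SIGMA_B ** 2 + SIGMA_W ** 2) ** 0.5
tenth_percentile = NormalDist(MU, total_sd).inv_cdf(0.10)

# Simulate one specimen from each of many future batches.
rng = random.Random(17)
draws = [MU + rng.gauss(0.0, SIGMA_B) + rng.gauss(0.0, SIGMA_W)
         for _ in range(50000)]
fraction_below = sum(x < tenth_percentile for x in draws) / len(draws)

print("tenth percentile of future measurements:", round(tenth_percentile, 2))
print("simulated fraction below it:", round(fraction_below, 3))
```

A basis value is a statistic computed from the available batches; the 95% confidence statement says that, over repeated experiments, it lies below this percentile at least 95 times in 100.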

An illustrative example of this method applied to nine batches of
material is shown below. The data sets did not pass the K-sample AD
test for pooling. Let the batches be numbered 1 through 9. The
strength values, listed in the order in which they appear in the
original table, are

61.3   66.5   66.0   61.9   68.9   75.8   72.8   71.9   68.7
68.5   64.7   72.7   68.0   65.0   75.2   75.0   71.0   76.3
62.5   64.9   67.1   63.3   70.9   71.5   66.3   69.5   76.6
66.0   65.2   67.7   74.6   65.4   69.6   69.5   69.5   66.2
66.6   64.8   69.5   70.3   65.7   66.2   68.2   69.1   66.5
64.9   66.1   71.9   72.6   74.6   72.4   72.8   109.6

with a single outlier, 109.6, identified by the MNR method. Assume
that 109.6 was an incorrect test result and was replaced by 69.6, a
corrected test value.


After  a  substantial  amount  of  computation  (see  refs.  1  and  2) 
involving  sums  of  squares,  within  batch  and  between  batch  variances, 
non-central  t  distribution,  etc.,  the  'B'-basis  value  is 

'B'  =  60.93

The  summary  statistics  are 
It should be noted that the value of 60.93 is lower than the value of
61.9 given by the nonparametric solution from the pooled sample. The
Modified Lemon method can be overly conservative (low basis values) in
order to guarantee 90% reliability with 95% confidence. The number of
batches and the variability between and within the batches affect the
computation of the basis value. If there are few batches and large
between batch variability with small within batch variability, then
this situation could result in very low basis numbers, depending on
the amount of variability and the number of batches.
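
The between batch and within batch variances that enter the computation can be estimated from a one-way analysis of variance. The sketch below shows only that step, using the usual method-of-moments estimates; it is not the Modified Lemon tolerance limit itself, which also needs the non-central t factors described in refs. 1 and 2. The function name and the small data set are hypothetical.

```python
from statistics import fmean

def variance_components(batches):
    """One-way ANOVA estimates of within- and between-batch variance."""
    k = len(batches)
    sizes = [len(b) for b in batches]
    n_total = sum(sizes)
    grand_mean = fmean([x for b in batches for x in b])
    batch_means = [fmean(b) for b in batches]

    # Sums of squares between batch means and within batches.
    ss_between = sum(n * (m - grand_mean) ** 2
                     for n, m in zip(sizes, batch_means))
    ss_within = sum((x - m) ** 2
                    for b, m in zip(batches, batch_means) for x in b)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Effective batch size; equals the common size for balanced data.
    n_eff = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    # Truncate at zero: the moment estimate can come out negative.
    var_between = max(0.0, (ms_between - ms_within) / n_eff)
    return {"ms_between": ms_between, "ms_within": ms_within,
            "var_between": var_between, "var_within": ms_within}

# Hypothetical data: three batches of three specimens each.
example = variance_components([[1.0, 2.0, 3.0],
                               [4.0, 5.0, 6.0],
                               [7.0, 8.0, 9.0]])
print(example)
```

When the between batch estimate is large relative to the within batch estimate and there are few batches, the basis value is driven almost entirely by the handful of batch means, which is the situation described above.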

In figure 6, results from application of the flowchart procedures are
shown for three batches of five specimens of AS4/Epoxy material tested
in compression. In this case, the mean strength values show a small
amount of variability while there is a relatively large spread within
each data set. 'B'-basis results from the flowchart application are
given for the following: ANOVA (Modified Lemon), Weibull, Normal,
Lognormal, and nonparametric methods. Not included in the flowchart
results is a list of assumptions that were violated. The results show a
small difference in basis values except for the nonparametric solution,
which has the low value of 167.1. The Weibull method was suggested
since it passed the K-sample AD test and the AD goodness-of-fit test.
The relatively large within batch variances and small differences in
mean values made it possible to pool the batches.

Figure 7 shows another result of computing the 'B'-basis values
using the ANOVA, Weibull, and normal methods applied to another three
selected batches from the same population as in figure 6. The ANOVA
result of 15.7 KSI is substantially lower than those from the other
two methods. Unfortunately, this is the result of a large difference in
mean values, which prevents pooling of the batches and therefore
requires the ANOVA application. The large difference in mean values in
addition to relatively small within batch variability resulted in this
extremely low basis value. A 'B' value of 6.5 was obtained from the
simple normal analysis using the three mean values. The result shows
that for this example the ANOVA method primarily depends on the batch
means. The above results would suggest obtaining more batches or
investigating testing and processing procedures.

In figure 8, results are shown for the case of randomly selecting
another batch from the same population described in figure 7. In this
case the ANOVA result shows a value of 105.4 KSI, which is substantially
larger than the 15.7 KSI recorded for the three batches. The
importance of having a larger number of batches is shown by these
results in figures 7 and 8. Also, with more data available, the
pooled results for the Weibull and Normal models resulted in less
conservative values.

Figure 9 presents results showing where a substantial amount of
within batch data is not necessary. In case 1, the ANOVA results for
three batches of 100 data values each resulted in 154.9 KSI, while for
case 2, three batches of ten each, a 'B'-basis value of 152 KSI was
obtained. This result emphasizes the importance of being able to
obtain more batches rather than increasing the batch size. However,
the ANOVA results in figure 6 show that three batches can provide
reasonable results similar to pooled results if the differences in mean
values are small relative to the batch variances. Note that for very large
batch sizes, the K-sample AD test can reject pooling of data even
though there is a small difference in mean values. This rejection is
statistically correct, but the user of the flowchart may consider the
difference in the batch means not of engineering importance. In this
case the user can make the decision of pooling or not pooling, since
there will be a small difference in basis values from pooled or
unpooled results. If there are large batch differences and the ANOVA
method is suggested from the flowchart, then adding more batches can
reduce the conservatism. The ANOVA method is a random effects model
which determines a basis value representing all future values obtained
from the same material system and type of test. In order to provide
this guarantee in the presence of large batch to batch variability,
there is the potential for it to be overly conservative, as was
shown in figure 7.


Reliability  at  Basis  Stress  Value 

Figure 10 conceptually describes the statistical reliability of a
simple structure in tension as it relates to the 'B'-basis applied
stress value. In the example shown in the figure, at most ten percent
of all the specimens (structures) will fail when subjected to load S.
This statement should be incorrect at most one time in twenty (95%
confidence). S is the 'B'-basis value obtained from strength (failure
load) measurements from specimens of similar material and geometry.
This statistical guarantee that at most 10% of the specimens will fail
can provide the engineer with a quantitative number for selecting and
applying material in composite material structures. This is unlike
the conventional deterministic property value approach, which is an ad
hoc procedure that reduces the mean strength measurements in order to
obtain some design value, and which can result in a potentially over or
under design situation. In applying the statistical basis value, it
is assumed that the material, geometry, and loading conditions in the
structural design situation are similar to those under which the
strength measurements were obtained. This is also true for deterministic
property value applications. In the following sections the inadequacies
of the deterministic approach are discussed in more detail.


Reliability  Values:  Statistical  vs.  Deterministic

In figure 11 the results of a simulation process involving the
random selection of ten values from a population of 191 strength
measurements, repeated 2,500 times, are graphically displayed. For each
simulation a design number or material property value is obtained from
each of the three procedures: X/2, (2/3)X, and the MIL-17 flowchart.
The mean value of the data set is X. The reliability values, as shown
in the figure, are obtained by evaluating the population probability
distribution fit to the 191 values at the design numbers.

In the case where the mean is reduced by a factor of 1/2, the
strength values are very low (90 KSI), and the reliability is
extremely high (1.0). The engineer may not be able to afford such a
high reliability value of 1.0 (to twenty significant digits) at the
expense of having design values as low as 90 KSI when the mean strength
is 180 KSI. The factor of 2/3 increases the design value but reduces the
reliability to approximately .999. The flowchart 'B'-basis calculation
provides higher strength values with acceptable reliability
numbers. The other two procedures show an element of uncertainty by
depending on the chosen factor. If the engineer used the factor of
1/2, this would result in an extreme overdesign situation requiring
either rejection of the material or the design. Alternatively, if the
engineer used the mean strength as the design number, the reliability
would be reduced to .5, although strength values would be much higher.
The flowchart procedure removes the uncertainty by providing a guaranteed
minimum reliability of .90 without unnecessarily reducing the
basis value. The minimum reliability can be increased to .99 if
necessary by using 'A'-basis computations as outlined in the MIL-17
Handbook.
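
The trade-off between design value and reliability described above can be reproduced approximately with a normal population. This is a sketch under stated assumptions: the paper evaluates a distribution fit to the 191 measurements, whose form is not given here, so a normal distribution is assumed, and the standard deviation of 19.4 KSI is a hypothetical value chosen only so that (2/3)X gives a reliability near .999. Only the mean of 180 KSI comes from the text.

```python
from statistics import NormalDist

MEAN_STRENGTH = 180.0  # KSI, from the text
ASSUMED_SD = 19.4      # KSI, hypothetical

# Assumed population of strengths (normal form is an assumption).
population = NormalDist(MEAN_STRENGTH, ASSUMED_SD)

def reliability(design_value):
    """Probability that a strength exceeds the design value."""
    return 1.0 - population.cdf(design_value)

design_numbers = {
    "X/2": MEAN_STRENGTH / 2.0,
    "(2/3)X": MEAN_STRENGTH * 2.0 / 3.0,
    "X": MEAN_STRENGTH,
    "population 10th percentile": population.inv_cdf(0.10),
}

for name, value in design_numbers.items():
    print(f"{name:28s} design = {value:6.1f} KSI   "
          f"reliability = {reliability(value):.6f}")
```

The first three rows depend entirely on the chosen factor. The last row is the value a 'B'-basis estimate is designed to stay below, and it gives a reliability of .90 by construction.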


Effect  of  Variance  on  Reliability  Estimates 

In figure 12 the effects of variance differences as they relate
to reliability estimates are shown from a simulation process. This
involved randomly selecting ten values from each of two separate
normal distributions with the same mean of 100 and different SD's of 5
and 25, repeated 2,500 times. The reliability values are obtained in a
similar manner as described in the previous section, except that the
probability values were obtained from the normal distribution. In the case
where the SD is 5, there is very little dispersion in the reliability
values. Again, the design number from X/2 is substantially lower than
the basis value using the flowchart process, although the reliability
is very high for this number. In comparing this with the results
using an SD of 25, there is a substantial increase in the dispersion of
the reliability values, particularly for the basis results using
flowchart methods. The flowchart results show similar reliability
estimates for both SD's of 5 and 25, although for X/2 the reliability
has been reduced substantially, from twelve nines to .96. This is the
result of the deterministic (X/2) approach being independent of
variance. This is not an issue if 50% reliability is required, but for
90% reliability, variability is important. Dividing the mean by two can
be nonconservative for situations where the distribution has a large
spread (long tail). In order to make an adjustment for this situation,
the flowchart method (basis value) is suggested. See the results in the
figure, where the basis value adjusts to a lower level but maintains
the same range for the reliability estimates. The basis value will
guarantee a reliability by adjusting the design value, while the safety
factor approach cannot guarantee reliability. This result suggests using
the basis method if it is important to maintain a certain level of
reliability. The overall issue is that the flowchart methods will provide
property values with specified reliability with 95% confidence, while
the deterministic approach is an ad hoc approach with no control of
the resulting reliability estimates.
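
The behavior of the X/2 design number in this simulation can be sketched as follows. The sample size, number of repetitions, mean, and SD's follow the description above; the function name and the random seed are hypothetical, and the flowchart basis value is not reproduced here because it requires the handbook tables.

```python
import random
from statistics import NormalDist, fmean

def half_mean_reliabilities(mean, sd, n=10, repeats=2500, seed=0):
    """Reliability of the design number X/2 for repeated samples of size n."""
    rng = random.Random(seed)
    population = NormalDist(mean, sd)
    results = []
    for _ in range(repeats):
        sample = [rng.gauss(mean, sd) for _ in range(n)]
        design = fmean(sample) / 2.0  # deterministic design number X/2
        # Reliability: probability that a strength exceeds the design number.
        results.append(1.0 - population.cdf(design))
    return results

tight = half_mean_reliabilities(100.0, 5.0)   # SD of 5
wide = half_mean_reliabilities(100.0, 25.0)   # SD of 25

print("SD = 5 : lowest reliability", min(tight))
print("SD = 25: lowest reliability", round(min(wide), 3),
      " average", round(fmean(wide), 3))
```

Because X/2 ignores the spread of the data, the same rule that is far more reliable than needed for an SD of 5 loses reliability when the SD is 25.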


CONCLUSIONS 


This paper is an exposition of the statistical procedures
described in the MIL-17 Handbook for obtaining material property
values. Its primary goal was to introduce the MIL-17 statistics
chapter to the users so that they may use it more effectively. The
methods and the sequence of operations suggested by the statistics
chapter flowchart were analyzed with respect to their effectiveness,
purpose, and limitations. By following the flowchart procedures,
guidance is provided to the user so that reasonably accurate property
values may be obtained without relying on ad hoc schemes which could
potentially result in either excessively low or high values.

Each method and its order of application were discussed with
respect to their specific purpose, such as model identification, batch
to batch variability recognition, outlier detection, and the basis
value computation. There are situations where low basis values will
result, not because of limitations in the statistical procedures but
usually as the result of very large or small data sets, large batch
to batch variations, or model recognition.

The comparison between the statistical reliability and the deterministic
approach showed a preference for statistics, since it was able
to guarantee a specified reliability, in contrast to a deterministic
method, which is primarily an ad hoc process resulting in considerable
uncertainty as to the corresponding reliability estimates. Finally,
the authors have attempted to provide a satisfactory definition of a
statistically based material property value by introducing the tolerance
limit concept and its importance. A number of illustrations were
presented showing the advantage of the tolerance limit over the
deterministic approach.

FIGURE 1  FLOW CHART ILLUSTRATING COMPUTATIONAL PROCEDURES

FIGURES 2 AND 3  SAMPLE SIZE EFFECT ON RELIABILITY
(Random data sets of size n from a Normal distribution)

FIGURES 4 AND 5  BASIS VALUE PROB. DENSITY FUNC.

FIGURE 6  EXAMPLE OF BASIS VALUE CALCULATION

FIGURE 7  EXAMPLE OF BASIS VALUE CALCULATION

FIGURE 8  EXAMPLE OF BASIS VALUE CALCULATION:
THE EFFECT OF AN ADDITIONAL BATCH

FIGURE 9  THE EFFECT OF INCREASED BATCH SIZE:
SUBSTANTIAL BETWEEN-BATCH VARIABILITY

FIGURE 11  RELIABILITY / STRENGTH COMPARISON:
(T300/Epoxy Unidirectional)

FIGURE 12  RELIABILITY / STRENGTH COMPARISON:
A CASE STUDY - STAT. VS DETERMINISTIC

STATISTICAL  CULTURE:  PROMOTING  THE  PRACTICE  OF  STATISTICS


Emanuel  Parzen 

Department  of  Statistics,  Texas  A&M  University 

Abstract 

This paper proposes a framework, called Statistical Culture, for studying the practice
of statistics with the aim of improving the health of statistical science as measured by how
well citizens and scientists use it as a tool in their daily life and research. We identify
a paradigm for lifelong learning based on identifying five (parallel, non-hierarchical) levels
of statistical literacy: consumer, applier, consultant, collaborator, theorist. We support
accreditation of statistical literacy. We make recommendations for how statisticians can
promote public recognition of the importance of statistics, statistical literacy, and interaction
between researchers and statisticians. We propose “solutions” to the use of statistics as
a scientific method by research which aims to unify and guide thinking about the diversity
of statistical methods and theories.

Contents: Statistical Culture as a Paradigm for Lifelong Learning, Solutions, Problems,
Levels, Excellence, Statistical Culture Levels Theorem, Olkin-Sacks Report, Statistical
Culture Applications Theorem, Statistical Culture Research Problems.

KEYWORDS:  Foundations,  Teaching,  Statistical  Literacy,  Statistical  Science,  Unification 
of  Statistical  Methods,  Statistical  Culture. 

STATISTICAL CULTURE AS A PARADIGM FOR LIFELONG LEARNING: The
health of a society is becoming increasingly dependent on its statistical literacy, and how
statistics is practiced. Modern society is data-rich and has an ever-increasing need to
understand how data becomes information (usable knowledge). The goal of continuous
improvement of the quality of processes involved in the delivery of products or services requires
that decisions be based on the information in data, not just on opinions or guesses; this is
the main recommendation of the philosophy of Ed Deming (see Mann (1988), p. 15).

This paper proposes that the practice of statistics at any of its levels should be a lifelong
endeavor characterized by the features that are being advocated as the requirements of
paradigms for lifelong learning that will be required in the 21st century (according to John
Sculley (1989), p. 1057):

•  “It  should  require  rigorous  mastery  of  subject  matter  under  expert  guidance. 

•  It  should  hone  the  conceptual  skills  that  wrest  meaning  from  data.

•  It  should  promote  a  healthy  skepticism  that  tests  reality  against  multiple  points 
of  view. 

•  It  should  nourish  individual  creativity  and  encourage  exploration. 

•  It  should  support  collaboration. 

•  It  should  reward  clear  communication.

•  It  should  provoke  a  journey  of  discovery. 

•  And  above  all  it  should  be  energized  by  the  opportunity  to  contribute  to  the  total
of  what  we  know  and  what  we  can  do.”

The  study  of  how  to  achieve  the  lifelong  learning  process  required  for  the  practice  of
statistics  is  called  “statistical  culture.”

This  paper  seeks  to  show  the  important  role  of  “statistical  culture”  in  the  practice  of 
statistics.  It  supports  the  concept  of  accreditation  of  statistical  literacy  at  various  levels. 

The  challenge  for  statistical  education  will  be  to  find  ways  of  bringing  to  the  process 
of  instruction  the  passion  for  discovery  that  drives  excellent  statistical  thinking. 

SOLUTIONS:  Statistical  culture  (the  study  of  the  practice  of  statistics)  has  goals 
of  elegance  and  utility.  The  elegance  of  statistical  culture  is  obvious;  it  enhances  the 
fun  of  doing  statistics. The  utility  of  the  study  of  the  culture  of  statistics  is  to  motivate 
statistical  “steersmanship”,  developing  consensus  about  (and  implementing)  the  actions 
needed  for  continuously  evaluating  and  improving  the  health  of  the  discipline  and  profession 
of  statistics. 

Statistical  culture  can  be  said  to  be  the  study  of  the  maps  (geography,  current  history) 
of  statistics,  rather  than  its  ancient  history  (as  in  the  history  of  statistics  up  to  1900).  It 
is  the  study  of  the  maps  of  statistics  from  the  point  of  view  of  understanding  its  current 
state  of  the  art  and  influencing  its  future  development. 

Statistical culture can be defined to be the study of:
how statistics is, and ought to be, practiced;
where statistics has applications (see Table 1) and who is doing the applying;
what to teach in statistics courses;
why statistics works;
when are competing probability models and statistical methods successful;
accreditation of statistical literacy (rather than competency) at various levels.

To  promote  the  practice  of  statistics,  statistical  culture  seeks: 

1. To develop maps of statistical methods which will help applied statisticians to strive
for continuous improvement of methods, to learn new methods to consider as alternatives,
to compare competing methods, to more confidently obtain conclusions from
comparisons of the results of competing methods of statistical data analysis of data
of a certain type, to obtain problem-driven results from methods-driven results, to
obtain substantive conclusions from data for which prior substantive knowledge was
not available.

2.  To  develop  maps  of  statistical  theories  which  help  theoretical  statisticians  to  define 
frontiers  of  research  and  thus  understand  the  sense  and  purpose  of  research  which 
otherwise  may  seem  unfocused  and  unmotivated. 

3.  To  develop  maps  of  the  relations  between  statistics  and  other  fields  of  knowledge  and 
research  which  will  help  interactions  between  statisticians  and  researchers  in  other 
disciplines  provide  more  recognition  to  the  research  contributions  of  statisticians. 

4. To develop maps of the contributions that statistical literacy and the practice of statistics
can make to a nation’s quality of life and world competitiveness.

5.  To  organize  (each  year,  in  each  community)  Statistical  Science  Awareness  Days  to 
promote  the  practice  of  statistics  and  public  recognition  of  outstanding  statisticians. 

Statistical culture (which develops unifications, maps, frameworks) is urgently needed
in order to improve the image of statistics among scientists and professionals. It would
provide the ability to objectively recognize by suitable awards more statisticians as “outstanding”
contributors to the missions of their organizations as well as to the discipline
and the profession of statistics.

Unification of methods is one of the important facets of the use of the scientific method
in any field of research (and therefore, a fortiori, in statistics). Unification of statistical
methods does not prevent statisticians from using ad hoc solutions (which many claim is
their preferred approach) but rather encourages and guides such methods by clarifying
the methods available which may be chosen ad hoc; therefore the ultimate goal of research
(such as Parzen (1989)) on Grand Unified Theories of Statistical Methods, denoted GUTS,
is “grand unified ad hockery”.
PROBLEMS: Statisticians are increasingly aware that there are urgent problems in
the discipline and profession of statistics; we argue that problems can be solved if they are
discussed using scientific methods and a framework for the “culture” of statistical practice.
Examples of such problems are: declining enrollment of statistics doctoral students,
difficulty of attracting young people into a career in statistics, teaching statistics to engineers
(Penzias (1989)), misunderstanding of the role of statisticians in quality control and
quality manufacturing (Hahn (1989)), expressions of dissatisfaction in the profession of
statistics about the appreciation and utilization of statisticians (Boroto and Zahn (1989)
and McPherson (1989)), failure of leading statisticians to continuously promote statistical
culture (to be providing leadership to the study of promoting the practice of statistics),
failure of many statisticians to be literate at appropriate levels in a diversity of statistical
methods (including time series analysis).

LEVELS:  We  believe  that  one  can  apply  the  scientific  method  to  the  study  of  statistical 
culture  (the  investigation  of  how  statistics  is,  and  ought  to  be,  practiced);  answers  to 
such  questions  should  not  be  based  on  prejudices  but  on  a  consensus  of  the  philosophical 
writings  of  successful  statisticians.  From  recent  literature  about  statistics  (Bodmer  (1985), 
McPherson  (1989))  one  can  conclude  the  following  first  step  in  drawing  a  map  of  the 
practice  of  statistics  (which  we  state  below  in  more  detail  as  the  Statistical  Culture  Levels 
Theorem). 

The practice of statistics occurs at three levels of understanding and practice:
popular,
science-related professionals, and
professional statisticians;

further, the practice of statistics by statisticians can be divided into three levels:
consulting,
collaboration,
theory and methods.
EXCELLENCE:  Statistical  culture  aims  to  provide  a  framework  which  stimulates
statisticians  to  understand  and  applaud  each  other’s  work  (indeed,  there  seems  to  be  too 
much  joy  in  “statistician  bashing”);  this  may  be  a  general  failing  of  human  nature  but  it 
seems  to  be  an  urgent  problem  for  statistics.  The  use  of  the  word  “level”  should  not  be 
interpreted  as  implying  vertical  or  series  structure,  with  activity  in  statistical  theory  at 
the  top.  The  levels  form  a  horizontal  or  parallel  structure;  it  cannot  be  emphasized  enough 
that  the  understanding  required  in  each  level  involves  different  aspects  of  the  practice  and 
methods  of  statistics.  A  possible  analogy  is  the  saying:  “Use  the  talents  you  possess;  for 
the  woods  would  be  very  silent  if  no  birds  sang  except  the  best.” 

Statistical culture does aim to support the search for excellence. Criteria should be
developed to rate good statistical practice as either average, superior, or exceptional; one
criterion is whether it is done at the level of “what,” “how,” or “why.”

STATISTICAL CULTURE LEVELS THEOREM: CONSUMER, APPLIER,
CONSULTANT, COLLABORATION, THEORY AND METHODS DEVELOPMENT. To
promote the practice of statistics, we propose that it is useful to identify five levels of
practice, defined as follows.

I.  Statistical  consumer: 

knows  definitions  of  statistics; 

appreciates  the  concept  of  variability  (distribution  of  outcomes); 

has  the  ability  to  understand  statistical  models  and  graphical  presentations  of  data 

analysis; 

does  not  have  a  working  knowledge  of  statistical  methods  or  the  ability  to  carry  out 
a  statistical  analysis; 

appreciates  the  role  of  statisticians  in  the  battle  for  statistical  literacy  (competence  in 
understanding,  applying  and  advancing  statistical  reasoning). 

Statistical literacy at the consumer level can be defined to be knowing that public
policy should be based on answers to the questions: “What can happen? What are the
odds (probabilities)? How do you know the odds?”

II.  Statistical  applier.  Distinguish  two  levels: 

II(A). knows basic statistical methods used to determine and obtain needed information;

ability to use menu driven statistical computing packages; fits all problems into convenient routine statistical conceptualizations;

II(B). ability to use command driven statistical computing environments;

understands the assumptions underlying statistical methods and can adapt statistical
methods to provide ad hoc methods for problems at hand;

Scientists and engineers involved in research or development should be statistical appliers;
those who become more statistically self-sufficient can take on more responsibility as their
own statistical consultants.

III.  Statistical  consultant: 

skilled  in  transforming  data  into  information; 

has  the  ability  to  examine  facts  and  serve  as  referees  of  statistical  analyses; 
aware  of  the  most  modern  statistical  methods; 

not  actively  involved  in  the  scientific  language  and  perspective  of  the  problems  being 
studied  so  that  conversation  between  client  or  customer  and  consulting  statistician  is 
less  a  dialogue  and  more  a  monologue; 

requires  abilities  to  interview  clients  to  obtain  an  understanding  of  their  problems, 
and  to  communicate  with  clients  by  oral  presentations  and  written  reports; 
often  advised  to  use  simple  techniques  for  scientists  unable  to  appreciate  subtleties  of 
statistics; 

helps  contribute  to  research  on  the  consulting  process. 

IV.  Statistical  collaborator: 

statistician is a collaborator on the project and is a catalyst and potential advocate of
actions and directions to be pursued in the project;

collaborative research often (if not always) leads to joint publications and/or joint research grants;

has mathematical training adequate to understand the philosophy and vigor of statistical methods but not completely the rigorous proof of their theory;

has ethical, administrative, and diplomatic skills, especially those required for large scale and long term research projects;

helps  contribute  to  research  on  the  collaboration  process. 

V.  Statistical  theorist: 

inevitably mathematically well trained;

seeks  to  develop  and  teach  the  logical  structure  of  statistical  methods,  to  understand 
how  they  are  born  and  how  they  die,  how  they  can  be  made  to  work  better  and  why 
they  work; 

basic  research  in  general  methods  that  provide  analogies  between  applications; 
fundamental  research  in  analogies  between  methods  (patterns  which  general  methods 
share  with  other  general  methods); 

mathematical research on the properties of statistical methods can be considered another level within the theory level.

OLKIN-SACKS 1988 REPORT: The distinction between consulting and collaboration
is based on how “equal” the statistician is regarded as a member of the research team. Olkin
and Sacks (1988) used the names “advisory collaboration” and “interactive collaboration”
(or Type A and Type B) for what we call “consulting” and “collaboration”. We quote the
report (p. 12):

“Typically,  the  statistician  engaged  in  advisory  work  will  adapt  existing  methodology 
to  the  problem  at  hand  and  create  computable  versions  of  known  techniques.  Another 
mode  of  collaboration  is  much  more  interactive  in  nature  and  involves  work  to  develop 
novel techniques and methods to deal with broader substantive questions. This second
type of collaboration leads to research on statistical issues that may subsequently advance
knowledge both in the substantive field and in statistics itself.

“The  survey  responses  indicated  a  high  frequency  of  Type  A  research,  while  sounding 
a common theme that Type B research does not receive sufficient time, money, or recognition
of its value. The short-run ‘advisory consultation’ rarely becomes the ‘long-range
interactive  collaboration.’  Yet  it  is  the  interactive  mode  that  has  the  greater  potential  to 
break  new  ground  and  lead  to  statistical  innovations  of  far-reaching  significance  for  the 
future  conduct  of  science,  and  it  is  this  type  of  collaboration  that  the  panel  feels  must 
receive  the  attention  of  the  disciplines  and  of  NSF  and  other  funding  agencies.” 

STATISTICAL CULTURE APPLICATIONS THEOREM: Another map required to
guide the practice of statistics, called a Statistical Culture Applications Theorem, is given
in Table 1, which lists disciplines represented in cross-disciplinary research involving
collaboration by faculty members in “statistics programs” in universities. The fields and
percentages are loosely adapted from Table 5 of the Olkin-Sacks report. The conjectured
percentages are intended to motivate passionate discussions (and, eventually, research).
An interesting research program is to investigate the proportion of new degree holders in statistics
who take employment applying statistics in each discipline listed in Table 1.

The  interests  of  statisticians  may  also  be  studied  by  investigating  the  distribution  of 
1987  doctorates  among  broad  fields  of  statistics  (see  Cox,  Voytuk,  and  Hart  (1989)): 


Probability and Math Stat        143
Biometrics and Biostatistics      37
Psychometrics                      9
Econometrics                      25
Social Sciences Statistics        49
TOTAL                            263
The racial/ethnic composition of mathematical doctorate degree recipients in the
period 1975 to 1986 was as follows:

                        White    Black    Hispanic    Asian
Math Sciences, total    89.8%    1.4%     1.4%        7.1%
Prob & Math Stat        85.9%    1.5%     1.5%        10.9%

The percentage of degrees to foreign citizens is 40% in statistics and 45% in mathematics.
The percentage of math-science doctorates working in education is 50% for statistics
and 60% for mathematics; 25% of statistics doctorates are university faculty members.
Table 1: Disciplines Where Statistics is Applied
Disciplines Represented in Statistical (and Time Series Analysis)
Cross-Disciplinary Collaborative Research

(Conjectured Percentage of Statisticians in Universities Involved in Collaboration)

Health  and  Life  Sciences  (25%,  25%) 

Medicine 

Public  Health  and  Epidemiology,  Biostatistics 

Biology 

Ecology 

Fisheries  and  Wildlife 

Environmental  Sciences 

Pharmacology and Toxicology

Genetics 

Entomology 

Forest  Science 

Physiology 

Engineering  and  Mathematical  Sciences  (15%) 

Engineering 

Computer  Sciences 

Operations  Research  and  Reliability 

Mathematics 

Signal  Processing 

Image  Analysis  and  Pattern  Recognition 
Industrial  Statistics 
Defense  Statistical  Standards 
Hydrology 

Behavioral  and  Social  Sciences  (15%) 

Psychology,  Cognitive  Sciences 

Economics,  Econometrics 

Education 

Sociology 

Political  Science 

Sample  Survey 

Government  Statistics 

Physical,  Chemical,  Earth  and  Atmospheric  Sciences  (10%) 

Chemistry,  Chemometrics 
Geology,  Geophysics 
Physics,  Astronomy, Chaos 
Meteorology 
Oceanography 

Agriculture  (4%) 

Animal  Science 
Soils  and  Crop  Sciences 
Agricultural  Economics 
Veterinary  Medicine 
Food  Science 

Business  Administration  (4%) 

Finance 

Forecasting 

Law  (2%) 
STATISTICAL CULTURE RESEARCH PROBLEMS:


DEFINITIONS OF STATISTICS AND STATISTICAL SCIENCE. Is a suitable definition
of statistics (which is similar to that of McPherson (1989), p. 224) “form expectations,
make observations, compare observations and expectations, continuously improve”? Is a
suitable definition of statistical science “the science of analyzing data by varying conditions
(probability models and estimation criteria) under which one analyzes a data set”? Note
that  laboratory  science  learns  about  a  phenomenon  by  varying  the  experiments  conducted 
to  generate  observations  about  the  phenomenon. 

EFFECTIVENESS RANKING OF STATISTICS PROGRAMS: Statistics programs
in U. S. universities are usually ranked by their contributions to research in statistical
methods and theory. Should they also be ranked by their effectiveness with regard to
their success in adding to the U. S. work force new degree holders (bachelors, masters,
doctorates) who have received education to practice statistics at the various levels we
have identified? Should we regard as unsatisfactory the following current approximate
proportions being produced on the average in the U. S.?


consumers (pre-calculus course)     800/10000
consumers (post-calculus course)    200/10000
appliers                            100/10000
consultants                          10/10000
collaborators                         4/10000
theorists                             2/10000

One category in which it is particularly urgent for statistics programs to increase the
number of students is consumer (post-calculus) courses, since this is the source which
supplies candidates for all other levels of statistical practice. A desirable goal for the fraction
of students in introductory courses who are taking a course with a calculus prerequisite is
30%.


UNDERGRADUATE EDUCATION: Provide students with a grid of introductory
courses in statistics which introduce the elegance and utility of statistical thinking, meet
the needs for training at various levels of statistical literacy, are appropriate to students’
scientific interests and mathematical backgrounds, and meet the goals of training all workers
to become statistically literate at the consumer level, and many researchers to become
statistically literate at the applier level.

The  television  series  “Against  All  Odds”  provides  excellent  supplementary  material 
for  undergraduate  statistical  education.  An  exposure  to  the  methods  and  applications 
discussed  in  “Against  All  Odds”  can  be  defined  to  be  a  superior  grade  of  statistical  literacy 
at  the  consumer  level. 

GRADUATE EDUCATION: Design graduate education in statistics so that it successfully
provides training at each level of the practice of statistics, and educates graduate
students to have broad interests in applied, theoretical, and computational modern statistics.
Students should have available courses in statistical culture which expose them to
the role played by statistical methods in each of the disciplines listed in Table 1.

One  of  the  important  expected  benefits  of  the  study  of  statistical  culture  is  to  help 
the  development  of  communication,  mutual  respect  and  cooperation  between  statisticians 
involved  with  various  levels  of  practice  of  statistics.  Graduate  students  in  statistics  come 
from  an  extreme  diversity  of  backgrounds.  The  study  of  statistical  culture  would  actively 
encourage  them  to  communicate  more  with  each  other  (as  well  as  with  their  faculty)  about 
the  expertise  which  they  should  acquire  as  students  and  also  during  their  careers.  Such 
discussions  should  be  part  of  the  graduate  curriculum  in  a  first  year  course  (which  could 
be  called  Statistical  Forum  or  Statistical  Culture)  which  would  also  help  students  decide 
about  whether  they  want  a  master’s  or  doctor’s  degree. 
STATISTICS AND RELATED FIELDS: Identify the relations between statistics and
mathematics,  between  statistics  and  probability,  between  statistics  and  computing,  and 
between  statistics  and  the  design  of  scientific  investigations. 

STATISTICAL VITALITY: How much of the current vitality of statistics derives from
the availability of jobs in industrial statistics, biostatistics, and environmental statistics?
Further, how do these areas of application compare with regard to the development
of the various levels of statistical practice?

THE  URGENT  NEED  FOR  MERGERS  OF  STATISTICIANS! 

Statisticians in the United Kingdom are currently calling for a more unified, less confusing
public image of Statistics by merging the Royal Statistical Society and the Institute
of Statisticians. Statistical Culture is the study of how statisticians of various levels can
successfully merge.

If we want to successfully achieve “Viva Statistical Science,” is it a prerequisite to also
successfully achieve “Viva Statistical Culture”? I believe that the answer is an unequivocal
yes if we take as our motto “Always remember... Statistics is Fun” (where fun can have one
or more of the meanings: fun (elegant), functional (useful), functional (abstract analysis),
function (graphical), function (estimation), fundamental).
PROCLAMATION


City  of  College  Station 

WHEREAS,  there  is  no  future  without  statistics, 

WHEREAS,  the  future  of  our  nation  requires  every  citizen  to  have  statistical  maturity  to 
understand  and  implement  decisions  inevitably  based  on  the  analysis  of  data, 

WHEREAS,  students  planning  careers  should  be  made  aware  of  the  importance,  relevance, 
and  beauty  of  statistical  science, 

WHEREAS, to help accomplish the above goals the week of April 23rd - 29th has been
proclaimed National Science and Technology Week and Mathematics Awareness Week,

WHEREAS, to stimulate awareness of statistics as a discipline at the interface of science
and mathematics, the Statistics Department of Texas A&M University is organizing a
program for Statistical Science Awareness Day on April 21, 1989,

NOW  THEREFORE  I,  Larry  J.  Ringer,  Mayor  of  the  City  of  College  Station,  do  hereby 
proclaim  April  21,  1989  as: 

“STATISTICAL  SCIENCE  AWARENESS  DAY” 

in  College  Station,  Texas,  and  urge  all  citizens  to  study  the  proposition  that  quality 
of  life  in  the  high  tech  world  of  the  future  requires  each  person  to  have  some  level  of 
statistical  maturity. 

PASSED  AND  APPROVED  THIS  THE  13th  DAY  OF  April,  1989. 
0. Preliminaries

In this report, some techniques for studying random mappings and related problems are
described. This summary concentrates primarily on methodology developed by the author.
Consequently, the work of other scientists active in this area will not receive extensive treatment in this
report. A monograph is in preparation, which will give substantial treatment of the history of the
subject and an extensive bibliography.

The present report will concentrate on two methods used by the author to obtain results in the
theory of random mappings.

The first of these is the use of classical combinatorial enumeration methods. The second
approach is the use of a “composition theorem” to construct generating functions. The latter technique
has wide generality, leading to many distinct results upon specialization of the parameters.

1.  Introduction. 

Let $X_n$ be a finite set with $|X_n| = n$ and let $T_n$ be the set of all mappings of $X_n$ into $X_n$. If
$\alpha, \beta \in T_n$, then define $(\alpha \cdot \beta)(x) = \alpha(\beta(x))$ for every $x \in X_n$. With no loss of generality, we can
take $X_n = \{1, 2, \ldots, n\}$. (It will be convenient to introduce some exceptions later, for which the
choice $X_n = \{0, 1, \ldots, n\}$ has some minor advantages.) Clearly $|T_n| = n^n$.

Let $P_{T_n}$ be a probability measure on the subsets of $T_n$. Various mathematical models are obtained
by appropriate choice of $P_{T_n}$. When there is no risk of ambiguity, the measure will be denoted by $P$.

2.  Representations  of  the  mappings. 

In this section we introduce two additional representations for a mapping $\alpha \in T_n$, which are
useful in many applications.

First, there is a one-to-one correspondence between a class of labelled directed graphs $G_n$,
known as functional digraphs, and $T_n$, the set of mappings of $X_n$ into $X_n$. This can be
demonstrated as follows. Fix $\alpha \in T_n$ and let $x \in X_n$. Then if $\alpha(x) = y$, draw the directed edge from $x$ to $y$.
Such a graph will have vertex set $X_n$ and have exactly one edge emanating from each vertex. These
graphs are in fact characterized by that property. Similarly, if a labelled graph whose vertex set is
$X_n$ is given for which exactly one edge emanates from each vertex, define $\alpha(x)$ as the terminus
of the edge leaving $x$ for each $x \in X_n$. Because of this isomorphism, we will identify each mapping
with its corresponding graph and employ the same notation and terminology for both.

Another representation may be constructed as follows. Let $A_n$ be the $n \times n$ matrix with
entries $a_{ij}$ defined as follows. If $\alpha(i) = j$, then let $a_{ij} = 1$; otherwise let $a_{ij} = 0$. Such a matrix has exactly one
“one” in each row. Conversely, assume there is an $n \times n$ matrix of “zeros” and “ones” with exactly one
“one” in each row. Then, if the “one” in row $i$ is in column $j$, set $\alpha(i) = j$, $i = 1, 2, \ldots, n$.

The three representations, the mapping, the directed graph and the matrix, can be used
interchangeably.
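The three representations can be sketched in a few lines of Python. This is only an illustration: the dictionary encoding of a mapping, the function names, and the example mapping on $X_4$ are assumptions made here, not notation from the report.

```python
def mapping_to_edges(alpha):
    """Functional digraph: one directed edge (x, alpha(x)) leaving each vertex x."""
    return [(x, alpha[x]) for x in sorted(alpha)]

def mapping_to_matrix(alpha):
    """0/1 matrix A with A[i-1][j-1] = 1 exactly when alpha(i) = j."""
    n = len(alpha)
    return [[1 if alpha[i] == j else 0 for j in range(1, n + 1)]
            for i in range(1, n + 1)]

def matrix_to_mapping(matrix):
    """Inverse direction: the single 1 in row i sits in column alpha(i)."""
    return {i + 1: row.index(1) + 1 for i, row in enumerate(matrix)}

# Hypothetical example mapping on X_4 = {1, 2, 3, 4}.
alpha = {1: 2, 2: 3, 3: 2, 4: 4}
edges = mapping_to_edges(alpha)
matrix = mapping_to_matrix(alpha)
```

Converting the matrix back with `matrix_to_mapping(matrix)` recovers `alpha`, which is the sense in which the representations are interchangeable.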

3. Properties of Mappings.

Let $\alpha \in T_n$ be a fixed mapping. For every $x \in X_n$ define $x_0 = x$, $x_1 = \alpha(x)$, $x_2 = \alpha(x_1) = \alpha^2(x), \ldots$.
That is, in general let $x_{m+1} = \alpha(x_m) = \alpha^m(x_1) = \alpha^{m+1}(x_0)$, for all $m \ge 0$.

If for some $m \ge 0$, $\alpha^m(x) = y$, then $y$ is the $m$th image of $x$; the set

$$S_\alpha(x) = \{x_0, x_1, \ldots\}$$

is the set of successors of $x$ under $\alpha$.

If for some $m \le 0$, $\alpha^m(x) = y$, then $y$ is an $m$th inverse of $x$ under $\alpha$. In general, $\alpha^m(x)$, $m < 0$,
may not exist or may not be unique.

Let

$$P_\alpha(x) = \bigcup_{m \le 0} \{\alpha^m(x)\};$$

$P_\alpha(x)$ is called the set of predecessors of $x$.

If there exists an $m > 0$ such that $\alpha^m(x) = x$, then $x$ is said to be a cyclic element under $\alpha$,
and the set

$$C_\alpha(x) = \{x, \alpha(x), \alpha^2(x), \ldots, \alpha^{m-1}(x)\}$$

is the cycle containing $x$. The least such $m$ is the length of the cycle containing $x$. If $x$ is not cyclic,
define $C_\alpha(x) = \emptyset$. The set of cyclic points under $\alpha$ is $C_\alpha = \bigcup_{x \in X_n} C_\alpha(x)$.
If there is an $r > 0$ and an $s > 0$ such that

$$\alpha^r(x) = \alpha^s(y),$$

then $x$ and $y$ are equivalent under $\alpha$. It is easy to see that this is an equivalence relation and
the equivalence class containing $x$, $K_\alpha(x)$, is called the component containing $x$. This equivalence
relationship decomposes $X_n$ into equivalence classes, which are called the components of $X_n$ under
$\alpha$. If $X_n = K_\alpha(x)$, then $\alpha$ is said to be connected (more precisely, the graph of $\alpha$, $G_\alpha$, is connected).
Also it is easy to see that each component has exactly one cycle.

Fix $x$ and consider the set $\{x, \alpha(x), \alpha^2(x), \ldots\}$. Since $x \in X_n$ and $|X_n| = n$, this set can have
at most $n$ distinct elements. Hence there are $r \ge 0$, $s > 0$ such that $\alpha^r(x) = \alpha^{r+s}(x)$. Taking $s$ as small as possible, the set

$$\{\alpha^r(x), \alpha^{r+1}(x), \ldots, \alpha^{r+s-1}(x)\}$$

is the cycle in the component $K_\alpha(x)$.

A vertex $x \in X_n$ is said to be of height $m$ under $\alpha$ if $m$ is the least non-negative integer such that
$\alpha^m(x)$ is cyclic. The set of vertices of height $m$ is called the $m$th stratum of $\alpha$, $S_{m,\alpha}$. Also, the
height of $\alpha$ is defined as

$$H_\alpha = \max\{m : S_{m,\alpha} \ne \emptyset\}.$$

Note that $S_{0,\alpha}$ is the set of vertices cyclic under $\alpha$, $C_\alpha$.

The restriction of $\alpha$ to $C_\alpha$ defines a mapping, which we call the permutation induced by $\alpha$. This
mapping, denoted by $\alpha^*$, is a permutation on a subset of $X_n$ of cardinality $|C_\alpha|$.

Finally, we introduce the notion of the order of an element $\alpha \in T_n$. Consider the set of distinct
elements in $\{\alpha, \alpha^2, \ldots\}$. The cardinality of this set is the order of $\alpha$. If $\alpha$ is a permutation, this
reduces to the usual definition of the order of elements in a group. We denote this by $O(\alpha)$ and it
is well-known that

$$O(\alpha) = O(\alpha^*) + \max(0, H_\alpha - 1).$$
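The definitions of cyclic points, height and order can be made concrete with a short Python sketch. The dictionary encoding of a mapping, the function names and the example mapping are assumptions chosen for illustration; `order_formula_holds` simply compares the two sides of the order relation stated above for one mapping.

```python
from itertools import product

def cyclic_points(alpha):
    """C_alpha: elements x with alpha^m(x) = x for some m > 0."""
    cyclic = set()
    for x in alpha:
        y = alpha[x]
        for _ in range(len(alpha)):      # checks alpha^1(x), ..., alpha^n(x)
            if y == x:
                cyclic.add(x)
                break
            y = alpha[y]
    return cyclic

def height(alpha):
    """H_alpha: the largest number of steps any vertex needs to reach a cyclic point."""
    cyclic = cyclic_points(alpha)
    best = 0
    for x in alpha:
        steps = 0
        while x not in cyclic:
            x = alpha[x]
            steps += 1
        best = max(best, steps)
    return best

def order(alpha):
    """Number of distinct mappings among alpha, alpha^2, alpha^3, ..."""
    seen = set()
    power = dict(alpha)
    while True:
        key = tuple(sorted(power.items()))
        if key in seen:                  # the sequence of powers is periodic from here on
            return len(seen)
        seen.add(key)
        power = {x: alpha[power[x]] for x in alpha}

def order_formula_holds(alpha):
    """Compare O(alpha) with O(alpha*) + max(0, H_alpha - 1)."""
    star = {x: alpha[x] for x in cyclic_points(alpha)}
    return order(alpha) == order(star) + max(0, height(alpha) - 1)

def all_mappings(n):
    """Every mapping of {1, ..., n} into itself, as a dictionary."""
    for images in product(range(1, n + 1), repeat=n):
        yield dict(zip(range(1, n + 1), images))

# Hypothetical example on X_5: the cycle (2 3), the fixed point 4, and the tail 5 -> 1 -> 2.
alpha = {1: 2, 2: 3, 3: 2, 4: 4, 5: 1}
```

Running `order_formula_holds` over `all_mappings(n)` for a small `n` is a cheap way to exercise the relation exhaustively.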

4. Mathematical Models.

In this section, we provide illustrations of some of the commonly employed choices of $P_{T_n}$ and
the mathematical structures that they describe.

1. Let $P\{\alpha(i) = j\} = 1/n$, $i, j = 1, 2, \ldots, n$, and let the random variables $\alpha(i)$ be mutually
independent. Then $P_{T_n}$ is the measure which assigns probability $n^{-n}$ to each mapping in $T_n$.
We will refer to this as the symmetric case.

2. Let $P\{\alpha(i) = j\} = 1/(n-1)$ if $j \ne i$, $P\{\alpha(i) = i\} = 0$, and let $\alpha(i)$ be mutually independent
random variables. Then $P_{T_n}$ is the measure which assigns the uniform probability distribution
over all mappings with no fixed points.

3. Let $P\{\alpha\} = 1/n!$ if $\alpha$ maps $X_n$ onto $X_n$ and 0 otherwise. Then $P_{T_n}$ is the uniform measure
over the set of permutations on $X_n$.

4. Let $P\{\alpha(i) = 1\} = p$ and $P\{\alpha(i) = j\} = (1-p)/(n-1)$, for $j \ne 1$. Also, let $\alpha(i)$ be independent
random variables. Then if $p > 1/n$, the set of mappings is known as mappings with an attracting
center. If $p < 1/n$, these are referred to as mappings with a repulsing center.

Other  assignments  lead  to  random  rooted  labelled  trees,  forests  of  random  rooted  labelled  trees, 
random  connected  mappings,  and  so  forth. 
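The four models listed above can be simulated directly. The following is a minimal sketch, assuming $n \ge 2$ and a mapping stored as a dictionary; the function names are hypothetical, and the sampler for model 4 takes the probability of each image $j \ne 1$ to be $(1-p)/(n-1)$, which is an assumption forced by normalization rather than a quotation.

```python
import random

def sample_symmetric(n, rng):
    """Model 1: each image alpha(i) is uniform on {1, ..., n}, independently."""
    return {i: rng.randint(1, n) for i in range(1, n + 1)}

def sample_no_fixed_points(n, rng):
    """Model 2: alpha(i) is uniform on {1, ..., n} minus {i}, independently."""
    return {i: rng.choice([j for j in range(1, n + 1) if j != i])
            for i in range(1, n + 1)}

def sample_permutation(n, rng):
    """Model 3: a uniformly distributed permutation of {1, ..., n}."""
    images = list(range(1, n + 1))
    rng.shuffle(images)
    return {i: images[i - 1] for i in range(1, n + 1)}

def sample_with_center(n, p, rng):
    """Model 4: alpha(i) = 1 with probability p, otherwise uniform on {2, ..., n}."""
    others = list(range(2, n + 1))
    return {i: 1 if rng.random() < p else rng.choice(others)
            for i in range(1, n + 1)}

rng = random.Random(0)                  # fixed seed so the sketch is repeatable
example = sample_with_center(8, 0.5, rng)
```

With `p` above `1/n` the element 1 receives more than its uniform share of images, which is the attracting-center situation described in item 4.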

In the sequel, we restrict to the symmetric case. The other cases will be treated in the more
extensive manuscript, which is in preparation.

5. Probability Distributions for the Symmetric Model.

For this case, $P_{T_n}\{\alpha\} = n^{-n}$ for every mapping $\alpha \in T_n$. We first establish Theorem 1.

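Theorem 1.

$$P\{|S_\alpha(x)| = k,\ |L_\alpha(x)| = j\} = \frac{(n-1)!}{(n-k)!\, n^{k}}, \qquad (5.1)$$

$1 \le j \le k \le n$, where $L_\alpha(x)$ is the cycle in $K_\alpha(x)$;

$$P\{|S_\alpha(x)| = k\} = \frac{k\,(n-1)!}{(n-k)!\, n^{k}}, \quad 1 \le k \le n; \qquad (5.2)$$

$$P\{|L_\alpha(x)| = j\} = \sum_{k=j}^{n} \frac{(n-1)!}{(n-k)!\, n^{k}}, \quad 1 \le j \le n; \qquad (5.3)$$

also

$$E|L_\alpha(x)| = E\{|S_\alpha(x)| + 1\}/2. \qquad (5.4)$$

Proof. Since the probability that $\alpha(i) = j$, $j = 1, 2, \ldots, n$, $i = 1, 2, \ldots, n$, is $n^{-1}$ and the
images of each element $i$ are independent random variables, we have:

$$P\{|S_\alpha(x)| = k,\ |L_\alpha(x)| = j\} = P\{\alpha^r(x) \ne x, \alpha(x), \ldots, \alpha^{r-1}(x),\ 0 < r \le k-1,\ \alpha^k(x) = \alpha^{k-j}(x)\} = \frac{(n-1)!}{(n-k)!\, n^{k}},$$

verifying (5.1); (5.3) follows trivially. To establish (5.2), one need only sum (5.1) over $j$,
$1 \le j \le k$. To establish (5.4), note that $P\{|L_\alpha(x)| = j \mid |S_\alpha(x)| = k\} = 1/k$, $1 \le j \le k$; therefore,

$$E\{|L_\alpha(x)|\} = E\{E\{|L_\alpha(x)| \mid |S_\alpha(x)|\}\} = E\left\{\frac{|S_\alpha(x)| + 1}{2}\right\}.$$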

The following theorem will be repeatedly employed.

Theorem 2. The joint distribution of $|S_0(\alpha)|, |S_1(\alpha)|, \ldots, |S_{n-1}(\alpha)|$, where $S_i(\alpha)$ denotes the $i$th stratum of $\alpha$, is given by

$$P\{|S_0(\alpha)| = n_0, |S_1(\alpha)| = n_1, \ldots, |S_{n-1}(\alpha)| = n_{n-1}\} = \frac{n!}{n_0!\, n_1! \cdots n_{n-1}!}\; n_0!\; n_0^{n_1} n_1^{n_2} \cdots n_{n-2}^{n_{n-1}}\; n^{-n}, \qquad (5.5)$$

where

$$\sum_{i=0}^{n-1} n_i = n.$$

Proof. $|S_0(\alpha)| = n$ if and only if $\alpha$ is one-to-one and onto; hence we obtain $n!\, n^{-n}$, which
coincides with (5.5) when $|S_0(\alpha)| = n$.

Otherwise, assume $|S_0(\alpha)| < n$. Then $n!/(n_0!\, n_1! \cdots n_{n-1}!)$ is the number of ways of partitioning $X_n$
among the various strata. The $n_0$ elements in $S_0(\alpha)$ can be permuted in $n_0!$ ways. Next, for each
stratum $S_i(\alpha)$, with $|S_i(\alpha)| = n_i$, there are $n_i^{n_{i+1}}$ ways for the $n_{i+1}$ elements in $S_{i+1}(\alpha)$ to have
images in $S_i(\alpha)$.

Remark. If some stratum, say $S_i(\alpha) = \emptyset$, then (5.5) $= 0$ unless $S_{i+1}(\alpha) = \cdots = S_{n-1}(\alpha) = \emptyset$.
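The counting argument behind Theorem 2 can be checked exhaustively for a small set. In the sketch below the function names are hypothetical, the count is the numerator of (5.5) before division by $n^n$, and the convention $0^0 = 1$ (which Python follows for integers) handles empty trailing strata.

```python
from collections import Counter
from itertools import product
from math import factorial

def strata_sizes(alpha):
    """Tuple (|S_0|, |S_1|, ..., |S_{n-1}|) of stratum sizes of the mapping alpha."""
    n = len(alpha)

    def is_cyclic(x):
        y = alpha[x]
        for _ in range(n):
            if y == x:
                return True
            y = alpha[y]
        return False

    cyclic = {x for x in alpha if is_cyclic(x)}
    sizes = [0] * n
    for x in alpha:
        h = 0
        while x not in cyclic:        # the height is the number of steps to a cyclic point
            x = alpha[x]
            h += 1
        sizes[h] += 1
    return tuple(sizes)

def count_from_5_5(sizes):
    """Number of mappings with the given stratum sizes, i.e. n^n times (5.5)."""
    count = factorial(sum(sizes))
    for s in sizes:
        count //= factorial(s)        # multinomial coefficient, exact at every step
    count *= factorial(sizes[0])      # permutations of the cyclic points
    for lower, upper in zip(sizes, sizes[1:]):
        count *= lower ** upper       # images of stratum i+1 inside stratum i
    return count

def brute_force_counts(n):
    """Tally the stratum-size tuples over all n^n mappings."""
    counts = Counter()
    for images in product(range(1, n + 1), repeat=n):
        counts[strata_sizes(dict(zip(range(1, n + 1), images)))] += 1
    return counts

counts = brute_force_counts(4)
```

Size tuples that cannot occur, such as an empty stratum followed by a non-empty one, receive the count 0 from `count_from_5_5`, in line with the Remark.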

Theorem 2 is basic to many items in the sequel. Marginal distributions available from Theorem
2 are the distribution of the number of cyclic points, the distribution of the height of the mapping,
the distributions of the number of elements in each stratum and the order of the mapping. The
following lemma will prove useful in many applications of Theorem 2.

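Lemma 1. For all complex $z$ and arbitrary positive integers $q$,

$$z(z+q)^{q-1} = \sum_{m=1}^{q} \sum_{\substack{l_1 + \cdots + l_m = q \\ l_1, \ldots, l_m \ge 1}} \frac{q!}{l_1! \cdots l_m!}\; z^{l_1} l_1^{l_2} \cdots l_{m-1}^{l_m}. \qquad (5.6)$$

Proof. If $q = 1$, the conclusion holds trivially. Therefore, assume that it holds for $1, 2, \ldots,
q-1$, $q \ge 2$. Now

$$z(z+q)^{q-1} = z \sum_{l_1=1}^{q} \binom{q-1}{l_1-1} z^{l_1-1} q^{q-l_1} = z^q + \sum_{l_1=1}^{q-1} \binom{q-1}{l_1-1} z^{l_1} q^{q-l_1}.$$

Since $1 \le q - l_1 \le q - 1$, the induction hypothesis applies (with $z$ replaced by $l_1$ and $q$ by $q - l_1$, which expands $l_1 q^{q-l_1-1}$) and we get

$$z(z+q)^{q-1} = z^q + \sum_{l_1=1}^{q-1} \sum_{m=1}^{q-l_1} \sum_{\substack{l_2 + \cdots + l_{m+1} = q - l_1 \\ l_2, \ldots, l_{m+1} \ge 1}} \frac{q!}{l_1! \cdots l_{m+1}!}\; z^{l_1} l_1^{l_2} \cdots l_m^{l_{m+1}}$$

$$= z^q + \sum_{M=2}^{q} \sum_{\substack{l_1 + \cdots + l_M = q \\ l_1, \ldots, l_M \ge 1}} \frac{q!}{l_1! \cdots l_M!}\; z^{l_1} l_1^{l_2} \cdots l_{M-1}^{l_M}.$$

Since $z^q$ is the term for $M = 1$, the induction is complete and (5.6) is established.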
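Identity (5.6) is easy to test numerically for integer $z$ and small $q$. The sketch below is only a sanity check under that restriction; the helper names are hypothetical.

```python
from math import factorial

def compositions(q, m):
    """All tuples (l_1, ..., l_m) of positive integers with l_1 + ... + l_m = q."""
    if m == 1:
        yield (q,)
        return
    for first in range(1, q - m + 2):
        for rest in compositions(q - first, m - 1):
            yield (first,) + rest

def lemma_rhs(z, q):
    """Right-hand side of (5.6), summed over all non-empty ordered partitions of q."""
    total = 0
    for m in range(1, q + 1):
        for parts in compositions(q, m):
            term = factorial(q)
            for l in parts:
                term //= factorial(l)            # multinomial coefficient q!/(l_1!...l_m!)
            term *= z ** parts[0]                # z^{l_1}
            for lower, upper in zip(parts, parts[1:]):
                term *= lower ** upper           # l_i^{l_{i+1}}
            total += term
    return total

def lemma_lhs(z, q):
    """Left-hand side of (5.6)."""
    return z * (z + q) ** (q - 1)
```

For example, `lemma_rhs(1, j - 1)` evaluates to `j ** (j - 2)`, the special case $z = 1$ of the lemma.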
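We now have:

Theorem 3. The distribution of the number of cyclical elements $|C_\alpha|$ is given by

$$P\{|C_\alpha| = j\} = \frac{(n-1)!\, j}{(n-j)!\, n^{j}}, \quad j = 1, 2, \ldots, n. \qquad (5.7)$$

Proof. From Theorem 2,

$$P\{|C_\alpha| = j\} = P\{|S_0(\alpha)| = j\} = \sum \frac{n!}{j!\, n_1! \cdots n_{n-1}!}\; j!\; j^{n_1} n_1^{n_2} \cdots n_{n-2}^{n_{n-1}}\; n^{-n},$$

the sum running over all partitions of $n - j$. We rewrite this expression in terms of non-empty
partitions obtaining

$$P\{|C_\alpha| = j\} = \sum \frac{n!}{j!\, n_1! \cdots n_m!}\; j!\; j^{n_1} n_1^{n_2} \cdots n_{m-1}^{n_m}\; n^{-n}, \qquad (5.8)$$

the sum running over $m \ge 1$ and $n_1, \ldots, n_m \ge 1$ with $\sum_{i=1}^{m} n_i = n - j$. A comparison
of (5.8) with (5.6) shows that this is related to (5.6) with $z$ replaced by $j$ and $q$ replaced by $n - j$, obtaining

$$P\{|C_\alpha| = j\} = \frac{n!}{(n-j)!} \cdot \frac{j\,(j + n - j)^{n-j-1}}{n^{n}},$$

establishing Theorem 3.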

Remark. Note that (5.7) and (5.2) are identical. There does not appear to be an obvious
explanation for this coincidence.

Theorem 4. The probability distribution of $|P_\alpha(x)|$ is given by

$$P\{|P_\alpha(x)| = j\} = \binom{n-1}{j-1} \frac{j^{j-2}\,(n-j)^{n-j}}{n^{n-1}}, \quad j = 1, 2, \ldots, n. \qquad (5.9)$$

Proof. For $j > 1$, let $X_{j-1}$ be $j - 1$ specified elements of $X_n$; we can designate these as
$x_1, x_2, \ldots, x_{j-1}$. Let $x$ be a distinguished element of $X_n$ not in $X_{j-1}$. Let $T_1$ be the set of mappings
$\alpha$ in $T_n$ with $\alpha(X_n - (X_{j-1} \cup \{x\})) \subseteq X_n - (X_{j-1} \cup \{x\})$. Define $T_2$ as those mappings $\alpha$ with
$\alpha(X_{j-1}) \subseteq X_{j-1} \cup \{x\}$ and $\alpha^k x_i = x$ for some $k > 0$. Let $T^* = T_1 \cap T_2$. Then

$$P\{|P_\alpha(x)| = j\} = \binom{n-1}{j-1} P\{\alpha \in T^*\}, \quad \text{for } j > 1,$$

and

$$P\{\alpha \in T^*\} = P\{\alpha \in T_1\}\, P\{\alpha \in T_2\}.$$

First, we have

$$P\{\alpha \in T_1\} = \left(\frac{n-j}{n}\right)^{n-j}.$$

Therefore, we need to calculate $P\{\alpha \in T_2\}$. This is accomplished by restricting attention to $X_{j-1}$
and defining the mapping $\alpha'$ satisfying $\alpha' x_i = \alpha x_i$, $i = 1, 2, \ldots, j-1$ and $\alpha' x = x$. That is, $\alpha'$ is
the restriction of $\alpha$ to $x_1, \ldots, x_{j-1}$ and $x$ becomes a fixed point.

Thus

$$P\{\alpha \in T_2\} = \sum \frac{(j-1)!}{n_1! \cdots n_m!}\; 1^{n_1} n_1^{n_2} \cdots n_{m-1}^{n_m}\; n^{-(j-1)}, \qquad (5.10)$$

the sum running over all non-empty partitions of $j - 1$. From Lemma 1 (with $z = 1$ and $q = j - 1$), the sum in (5.10) can
readily be evaluated, obtaining

$$P\{\alpha \in T_2\} = \frac{j^{j-2}}{n^{j-1}}$$

and hence

$$P\{\alpha \in T^*\} = \frac{(n-j)^{n-j}\, j^{j-2}}{n^{n-1}},$$

establishing (5.9), for $j > 1$.

If $j = 1$, then $X_n - \{x\}$ is mapped into $X_n - \{x\}$; there are $(n-1)^{n-1}$ such mappings,
which also yields (5.9).
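The predecessor distribution can also be compared with an exhaustive count for small $n$. In this sketch the function names are hypothetical, $x$ is counted as its own predecessor (the case $m = 0$), and `formula_5_9` encodes $\binom{n-1}{j-1} j^{j-2}(n-j)^{n-j}/n^{n-1}$ as obtained from the proof above.

```python
from fractions import Fraction
from itertools import product
from math import comb

def predecessor_count(alpha, x):
    """|P_alpha(x)|: x itself together with every y whose orbit reaches x."""
    n = len(alpha)
    total = 0
    for y in alpha:
        z = y
        for _ in range(n):            # checks y, alpha(y), ..., alpha^{n-1}(y)
            if z == x:
                total += 1
                break
            z = alpha[z]
    return total

def exact_predecessor_distribution(n, x=1):
    """Exact distribution of |P_alpha(x)| under the symmetric model."""
    counts = {}
    for images in product(range(1, n + 1), repeat=n):
        alpha = dict(zip(range(1, n + 1), images))
        j = predecessor_count(alpha, x)
        counts[j] = counts.get(j, 0) + 1
    return {j: Fraction(c, n ** n) for j, c in counts.items()}

def formula_5_9(n, j):
    """Right-hand side of (5.9); Fraction keeps j^(j-2) exact when j = 1."""
    return (comb(n - 1, j - 1) * Fraction(j) ** (j - 2)
            * (n - j) ** (n - j) / Fraction(n) ** (n - 1))

dist = exact_predecessor_distribution(4)
```

The value `formula_5_9(n, n)` equals `1/n`, the probability that every element is a predecessor of $x$.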

Trivially,  we  have 
Remark. If $|P_\alpha(x)| = n$, then $x$ is cyclic and $\alpha$ is connected. This event has probability $n^{-1}$
by (5.9).

Corollary  1. 


(5.11) 


6. Asymptotic Estimates in the Symmetric Case.

We now obtain the asymptotic ($n \to \infty$) probability density functions of $|S_\alpha(x)|$, $|L_\alpha(x)|$, $|C_\alpha|$.

Accordingly, we establish the following theorem.

Theorem 5. The joint asymptotic ($n \to \infty$) probability density function of $|S_\alpha(x)|/\sqrt{n}$, $|L_\alpha(x)|/\sqrt{n}$ is
given by

$$f(u, v) = e^{-u^2/2}, \quad 0 < v \le u < \infty, \qquad (6.1)$$

where $u = |S_\alpha(x)|/\sqrt{n}$, $v = |L_\alpha(x)|/\sqrt{n}$.

The asymptotic ($n \to \infty$) probability density function of $u = |S_\alpha(x)|/\sqrt{n}$ is

$$f(u) = u\, e^{-u^2/2}, \quad u > 0. \qquad (6.2)$$

The asymptotic ($n \to \infty$) probability density function of $v = |L_\alpha(x)|/\sqrt{n}$ is

$$f(v) = \sqrt{2\pi}\,(1 - \Phi(v)), \quad v > 0, \qquad (6.3)$$

where $\Phi(v)$ is the cumulative distribution function of the standard normal distribution. Specifically

$$\Phi(v) = \int_{-\infty}^{v} (2\pi)^{-1/2} e^{-x^2/2}\, dx.$$

Proof. In (5.1) let $k = \sqrt{n}\,u$, $j = \sqrt{n}\,v$ and replace the factorials using Stirling’s formula. This
gives

$$P\{|S_\alpha(x)| = \sqrt{n}\,u,\ |L_\alpha(x)| = \sqrt{n}\,v\} \sim \frac{e^{-\sqrt{n}\,u}}{n\,(1 - u/\sqrt{n})^{\,n - \sqrt{n}\,u + 1/2}}.$$

Expanding $\log(1 - u/\sqrt{n})$ in a power series, we get

$$f(u, v) = e^{-u^2/2}, \quad 0 < v \le u < \infty.$$

(6.2) and (6.3) are obtained by calculating the corresponding marginal distributions.
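The quality of the limiting density (6.2) can be gauged against the exact distribution (5.2) without any simulation. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the choice $n = 10000$ is arbitrary, and the exact probabilities are built from the product form $\frac{k}{n}\prod_{i=1}^{k-1}(1 - i/n)$ of (5.2) to avoid large factorials.

```python
from math import exp, pi, sqrt

def exact_pmf_successors(n):
    """P{|S_alpha(x)| = k} for k = 1..n, computed as (k/n) * prod_{i<k} (1 - i/n)."""
    pmf = []
    survive = 1.0                      # prod_{i=1}^{k-1} (1 - i/n)
    for k in range(1, n + 1):
        pmf.append(survive * k / n)
        survive *= 1.0 - k / n
    return pmf

n = 10_000
pmf = exact_pmf_successors(n)
mean = sum(k * p for k, p in enumerate(pmf, start=1))
second_moment = sum(k * k * p for k, p in enumerate(pmf, start=1))
variance = second_moment - mean ** 2

# Limiting values implied by the density u * exp(-u^2 / 2) after rescaling by sqrt(n).
limit_mean = sqrt(pi * n / 2)
limit_variance = n * (2 - pi / 2)

# Pointwise comparison at u = 1: sqrt(n) * P{|S| = sqrt(n)} against the density at u = 1.
k_at_u1 = int(sqrt(n))
scaled_prob_at_u1 = sqrt(n) * pmf[k_at_u1 - 1]
density_at_u1 = 1.0 * exp(-0.5)
```

The exact mean and variance approach the limiting values from below, so a relative error of a percent or so at this $n$ is to be expected.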


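From these asymptotic relationships, we can obtain the following corollary.

Corollary 1. The means and variances are given by

$$E\{|L_\alpha(x)|\} \sim \tfrac{1}{4}(2\pi n)^{1/2}, \qquad \sigma^2(|L_\alpha(x)|) \sim n\left(\tfrac{2}{3} - \tfrac{\pi}{8}\right), \qquad (6.4)$$

$$E\{|S_\alpha(x)|\} \sim \tfrac{1}{2}(2\pi n)^{1/2}, \qquad \sigma^2(|S_\alpha(x)|) \sim n\left(2 - \tfrac{\pi}{2}\right), \qquad (6.5)$$

and

$$E\{|C_\alpha|\} \sim \tfrac{1}{2}(2\pi n)^{1/2}, \qquad \sigma^2(|C_\alpha|) \sim n\left(2 - \tfrac{\pi}{2}\right). \qquad (6.6)$$

7. The Composition Theorem.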


In  this  section,  we  give  an  abbreviated  treatment  of  the  composition  theorem.  An  extensive 
discussion  of  this  theorem  and  some  generalizations  of  it  will  be  treated  in  the  future  monograph. 

Let $S_k$ be the symmetric group on $\{1, 2, \ldots, k\}$. To $\gamma \in S_k$ we can associate a partition $\{r_1, r_2, \ldots, r_k\}$,
where $r_i$ is the number of cycles of length $i$. Clearly $\sum_i i\,r_i = k$. The $k$-tuple $\{r_1, \ldots, r_k\}$ will
be referred to as the class of $\gamma$. A subset $M_k$ of $S_k$ will be called self-conjugate if and only if
it is the set of all permutations in a subset of the possible classes. It is easily seen that for every
$\lambda \in S_k$, $\lambda M_k \lambda^{-1} = M_k$. Now let $W_k$ be given self-conjugate subsets of $S_k$ and let $w_k = |W_k|$, $w_0 = 1$,
and let $w$ denote the sequence $\{w_k\}_{k=0}^{\infty}$. Define

$$\Psi_w(z) = \sum_{k=0}^{\infty} w_k \frac{z^k}{k!}. \qquad (7.1)$$
Let $W_{k,n}$ be the set of all $\beta = \gamma\tau\gamma^{-1}$, where $\gamma$ is a one-to-one mapping of $X_k$ into $X_n$
and $\tau \in W_k$. Now we enumerate the set of $\alpha \in T_n$ with

1. $\alpha^* \in W_{k,n}$ for some $k \le n$

2. $\alpha \in T_{j,n}$, the set of $\alpha \in T_n$ of height $\le j$.

The number of such mappings $\alpha \in T_n$ will be denoted by $V_{w,j,n}$, where $V_{w,j,0} = 1$. Also, we denote
by

Theorem 6. If $n \ge 0$ and $0 \le j \le n$,

$$V_{w,j,n} = \sum \frac{n!}{k_0!\,k_1! \cdots k_j!}\; w_{k_0}\, k_0^{k_1} k_1^{k_2} \cdots k_{j-1}^{k_j}, \qquad (7.2)$$

where the sum runs over $k_0 + k_1 + \cdots + k_j = n$, $k_0, k_1, \ldots, k_j \ge 0$.
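
The stratum-by-stratum count in Theorem 6 can be checked numerically on small cases. The sketch below assumes the formula reads as a sum over $k_0 + \cdots + k_j = n$ of $\frac{n!}{k_0! \cdots k_j!} w_{k_0} k_0^{k_1} \cdots k_{j-1}^{k_j}$, and it takes $W_k = S_k$ (so $w_k = k!$), in which case the formula should count every mapping of height at most $j$. The function names are illustrative and do not come from the paper.

```python
from itertools import product
from math import factorial


def compositions(total, parts):
    """Yield every tuple of `parts` nonnegative integers that sums to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def count_by_formula(n, j, w):
    """Evaluate the stratum sum; w(k) is the assumed size of W_k, with w(0) = 1."""
    count = 0
    for ks in compositions(n, j + 1):
        multinomial = factorial(n)
        for k in ks:
            multinomial //= factorial(k)
        term = multinomial * w(ks[0])
        # k_{i-1} ** k_i ways to send stratum i into stratum i-1 (0 ** 0 == 1).
        for lower, upper in zip(ks, ks[1:]):
            term *= lower ** upper
        count += term
    return count


def height(f):
    """Largest number of steps any point needs to reach a cyclic point of f."""
    n = len(f)
    cyclic = set()
    for x in range(n):
        y = f[x]
        for _ in range(n):
            if y == x:
                cyclic.add(x)
                break
            y = f[y]
    worst = 0
    for x in range(n):
        steps, y = 0, x
        while y not in cyclic:
            y, steps = f[y], steps + 1
        worst = max(worst, steps)
    return worst


def count_by_enumeration(n, j):
    """Brute-force count of mappings on n points with height <= j (W_k = S_k)."""
    return sum(1 for f in product(range(n), repeat=n) if height(f) <= j)


for j in range(3):
    print(j, count_by_formula(3, j, factorial), count_by_enumeration(3, j))
```

For $n = 3$ the two counts agree at 6, 21 and 27 for $j = 0, 1, 2$: the permutations, the mappings of height at most one, and all $3^3$ mappings.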

Proof. This is an immediate consequence of Theorem 2. The following corollary is often very
useful. 

Corollary 2. Let $V_{w,j,k,n}$ be the number of $\alpha \in T_n$ with $\alpha^* \in W_{k,n}$, $k$ fixed and $1 \le k \le n$.

Then 


and 


E 


k-L 


V*,n 


n_  1 
*-*/ 


Proof. The proof follows readily from Lemma 1.
We  now  define 


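$$V_{w,j,n}(t_0, t_1, \ldots, t_j) = \sum \frac{n!}{k_0!\,k_1! \cdots k_j!}\; w_{k_0}\, k_0^{k_1} k_1^{k_2} \cdots k_{j-1}^{k_j}\; t_0^{k_0} t_1^{k_1} \cdots t_j^{k_j}$$

and

$$\Psi_{w,j}(z;\, t_0, t_1, \ldots, t_j) = \sum_{n=0}^{\infty} V_{w,j,n}(t_0, t_1, \ldots, t_j)\, \frac{z^n}{n!}.$$

This leads to Theorem 7.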


(7.3) 

(7.4) 


(7.5) 


(7.6) 
Theorem 7.

$$\Psi_{w,0}(z;\, t_0) = \Psi_w(z t_0) \qquad (7.7)$$

and for $j \ge 1$

$$\Psi_{w,j}(z;\, t_0, t_1, \ldots, t_j) = \Psi_{w,j-1}(z;\, t_0, t_1, \ldots, t_{j-2},\, t_{j-1} e^{z t_j}). \qquad (7.8)$$

Remark. This theorem can be established in a formal algebraic sense. To obtain an equivalent
analytic formula, one needs to restrict to $|z t_{j-1} e^{z t_j}| < e^{-1}$ and $\max\{|z t_0|, \ldots, |z t_j|\} < e^{-1}$. Such
details are omitted here but are essential for asymptotic analysis.

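Proof. Write

$$\Psi_{w,j}(z;\, t_0, \ldots, t_j) = \sum_{n=0}^{\infty} \frac{z^n}{n!} \sum_{k_0 + \cdots + k_j = n} \frac{n!}{k_0! \cdots k_j!}\; w_{k_0}\, k_0^{k_1} \cdots k_{j-1}^{k_j}\; t_0^{k_0} \cdots t_j^{k_j}$$

$$= \sum_{k_0, \ldots, k_{j-1}} \frac{w_{k_0}\, k_0^{k_1} \cdots k_{j-2}^{k_{j-1}}}{k_0! \cdots k_{j-1}!}\, (z t_0)^{k_0} \cdots (z t_{j-1})^{k_{j-1}} \sum_{k_j = 0}^{\infty} \frac{(k_{j-1} z t_j)^{k_j}}{k_j!}$$

$$= \sum_{k_0, \ldots, k_{j-1}} \frac{w_{k_0}\, k_0^{k_1} \cdots k_{j-2}^{k_{j-1}}}{k_0! \cdots k_{j-1}!}\, (z t_0)^{k_0} \cdots (z t_{j-2})^{k_{j-2}}\, (z t_{j-1} e^{z t_j})^{k_{j-1}}$$

$$= \Psi_{w,j-1}(z;\, t_0, \ldots, t_{j-2},\, t_{j-1} e^{z t_j}).$$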

Let $\Lambda_0(z_0) = z_0$, $\Lambda_1(z_0, z_1) = z_0 e^{z_1}$ and for $j \ge 2$ let

$$\Lambda_j(z_0, z_1, \ldots, z_j) = \Lambda_{j-1}(z_0, z_1, \ldots, z_{j-2},\, z_{j-1} e^{z_j}).$$

Theorem 8. For $j \ge 1$, we have

$$\Lambda_j(z_0, z_1, \ldots, z_j) = z_0 \exp\{\Lambda_{j-1}(z_1, \ldots, z_j)\}. \qquad (7.9)$$

Proof. The conclusion is immediate for $j = 1$. Assume that the result is valid for $j - 1$, $j \ge 2$.
Then

$$\Lambda_j(z_0, z_1, \ldots, z_j) = \Lambda_{j-1}(z_0, z_1, \ldots, z_{j-2},\, z_{j-1} e^{z_j})$$

$$= z_0 \exp\{\Lambda_{j-2}(z_1, \ldots, z_{j-2},\, z_{j-1} e^{z_j})\}$$

$$= z_0 \exp\{\Lambda_{j-1}(z_1, \ldots, z_j)\},$$

establishing the theorem.

We  now  state  and  prove  the  composition  theorem. 

Theorem 9.

$$\Psi_{w,j}(z;\, t_0, t_1, \ldots, t_j) = \Psi_w(\Lambda_j(z t_0, z t_1, \ldots, z t_j)). \qquad (7.10)$$

Proof. For $j = 0$ this is a consequence of Theorem 7. Also from Theorem 7,

$$\Psi_{w,j}(z;\, t_0, t_1, \ldots, t_j) = \Psi_{w,j-1}(z;\, t_0, t_1, \ldots, t_{j-2},\, t_{j-1} e^{z t_j}).$$

Therefore,

$$\Psi_{w,j-1}(z;\, t_0, t_1, \ldots, t_{j-2},\, t_{j-1} e^{z t_j}) = \Psi_w(\Lambda_{j-1}(z t_0, \ldots, z t_{j-2},\, z t_{j-1} e^{z t_j}))$$

$$= \Psi_w(\Lambda_j(z t_0, \ldots, z t_{j-2}, z t_{j-1}, z t_j)).$$

The  following  corollaries  can  now  be  easily  established. 

Corollary  3. 

^(a;t0 . tj)  *+  'A,(*t0 . at,))  (7.11) 

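Corollary 4. Let $\Lambda_j(z) = \Lambda_j(z, z, \ldots, z)$. Then

$$\Psi_{w,j}(z) = \Psi_w(\Lambda_j(z)) = \sum_{n=0}^{\infty} V_{w,j,n}\, \frac{z^n}{n!}, \qquad (7.12)$$

where $\Lambda_0(z) = z$ and for $j \ge 1$

$$\Lambda_j(z) = z \exp\{\Lambda_{j-1}(z)\}. \qquad (7.13)$$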

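Corollary 5. Let $V_{w,j,k,n}$ be the number of mappings $\alpha \in T_n$ with $\alpha^* \in W_{k,n}$, $\alpha \in T_{j,n}$ and $|C_\alpha| = k$.
Then

$$\sum_{n=0}^{\infty} \sum_{k=0}^{n} V_{w,j,k,n}\, \frac{t^k z^n}{n!} = \Psi_w(\Lambda_j(zt, z, \ldots, z)). \qquad (7.14)$$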

The composition theorem provides enumerating formulas for mappings satisfying the hypotheses
of Theorem 6. For such mappings it permits enumeration by number of points, number of
points on cycles, number of points in each stratum and so on. The ability to choose $W_k$ provides
the generality of the results. Illustrations follow.
Example 1. Let $W_k$ be the set of $k$-cycles; then the set of mappings considered is the set of
connected mappings.

Example 2. If $W_k$ is the identity mapping for $k = 1$ and $W_k = \emptyset$ for $k \ne 1$, the set of mappings
is the set of rooted labelled trees.
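
The rooted-tree case can be used to check the iterated-exponential recursion numerically. This sketch assumes $\Psi_w(z) = \sum_k w_k z^k/k!$ with $w_0 = w_1 = 1$ and all other $w_k = 0$, so that $\Psi_w(\Lambda_j(z)) = 1 + \Lambda_j(z)$, and assumes $\Lambda_0(z) = z$, $\Lambda_j(z) = z \exp\{\Lambda_{j-1}(z)\}$. Under those assumptions, $n!$ times the $z^n$ coefficient of $\Lambda_j(z)$ should count rooted labelled trees on $n$ points of height at most $j$, which is Cayley's $n^{n-1}$ once $j \ge n - 1$. The helper names are illustrative.

```python
from fractions import Fraction
from math import factorial


def series_exp(f, order):
    """Coefficients of exp(f(z)) up to z**order, for a series f with f[0] == 0."""
    g = [Fraction(0)] * (order + 1)
    g[0] = Fraction(1)
    for n in range(1, order + 1):
        # From g' = f' g:  n * g_n = sum_{k=1..n} k * f_k * g_{n-k}.
        g[n] = sum(k * f[k] * g[n - k] for k in range(1, n + 1)) / n
    return g


def lambda_series(j, order):
    """Coefficients of Lambda_j(z) up to z**order (order >= 1)."""
    lam = [Fraction(0)] * (order + 1)
    lam[1] = Fraction(1)                       # Lambda_0(z) = z
    for _ in range(j):
        # Lambda_j = z * exp(Lambda_{j-1}): shift the exponential by one power of z.
        lam = [Fraction(0)] + series_exp(lam, order)[:order]
    return lam


def trees_of_height_at_most(j, n):
    """n! times the z**n coefficient of Lambda_j(z), for n >= 1."""
    return lambda_series(j, n)[n] * factorial(n)


for n in range(1, 7):
    print(n, trees_of_height_at_most(n - 1, n), n ** (n - 1))
```

Exact rational arithmetic avoids any rounding question when the coefficients are compared with $n^{n-1}$.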
HANDLING UNCERTAINTY IN INPUT TO EXPECTED VALUE MODELS

Mark  A.  Youngren 
Requirements  Directorate 
US  Army  Concepts  Analysis  Agency 
8120  Woodmont  Avenue 
Bethesda,  Maryland  20814-2797 


ABSTRACT.  Due  to  the  large  number  of  entities  and  processes  that  must  be  represented,  combat 
models at the theater level in the Army today are expected value models. An expected value model is
deterministic  --  it  uses  the  expected  value  of  random  variables  as  inputs  and  generally  uses  some  sort 
of  expected  value  within  the  internal  processes.  The  use  of  expected  value  models  creates  problems  in 
the  proper  interpretation  of  their  output  and  ways  for  representing  the  uncertainty  associated  with 
the  model  input  and  processes. 

This  paper  suggests  a  method  for  handling  uncertainty  in  the  input  data  sets  (which  usually 
contain elements that are specific realizations of random processes) in situations where the outcomes
of interest can be expressed in binary variables (e.g., “success” or “failure”). A theater nuclear
exchange  is  used  as  an  example,  having  many  different  possible  outcomes  determined  by  random 
processes.  A  method  is  provided  for  describing  the  space  of  all  possible  outcomes  of  the  exchange  and 
partitioning  the  space  into  sets  of  outcomes  which,  if  used  as  input  into  a  theater-level  conventional 
simulation,  are  expected  to  lead  to  significantly  different  results.  A  method  for  sampling  the  most 
probable  outcome  from  each  set  is  also  explained. 

This  approach  permits  the  construction  of  an  experimental  plan  that  requires  a  small  number  of 
model  runs,  each  run  expected  to  provide  a  significantly  different  result.  From  these  runs  an 
estimate  of  the  variability  in  the  theater  combat  resulting  from  uncertainty  in  the  input  data  (in 
this  case,  the  impact  of  a  nuclear  exchange)  can  be  made. 

1. Introduction. Modeling large systems and processes such as combat at the theater level is difficult.
The number of possible units and interactions has driven most modelers to use an expected value
approach. An expected value model uses the expected value of random variables as inputs and
generally uses some sort of expected value within the internal processes. The models are
deterministic; that is, they will yield only one set of outputs for any given set of inputs. The use of
expected  value  models  creates  problems  in  the  proper  interpretation  of  their  output  and  ways  for 
representing  the  uncertainty  associated  with  the  model  input  and  processes.  In  a  recent  discussion 
paper,  Stockton  [1989]  provided  the  following  example: 

“A Red unit will go northwest or northeast based on whether his strength at a given point is
above or below some threshold value. Let's say that the real-world probability of being above the
threshold is 0.6 and, if above, he will go northwest to face a very strong Blue force armed with
Supertank. If he goes northeast (probability 0.4), he faces a relatively weaker force, armed with bows
and arrows. With several replications of a stochastic model, expected losses will consider both
possibilities and will develop expenditures of tank ammo and arrows; with an expected value model,
he will always go toward the stronger force, and no expenditures of arrows will be observed.”

Stockton correctly points out that the results of an expected value model, even when provided
expected value inputs, are not the expected value of the output. He suggests that the output of such
a model may be a “most likely value,” using his example. However, we can offer another example
which illustrates that expected value models also fail to provide a “most likely” result.

Suppose in the example provided above that the Red force has a visual sensor that can see all of
the Blue forces traveling together (with probability 1) if the skies are clear, and cannot see any of
the Blue force if the skies are cloudy. To simplify, suppose that the skies are either clear or cloudy,
and the probability that the skies are clear is 0.6. How many Blue units are detected by the Red
force? The expected value is 0.6 · (100 percent of the Blue units) + 0.4 · (0 percent of the Blue
units) = 60 percent of the Blue units. Expected value models will normally apply expected values,
either as inputs to the model (60 percent would be an expected value for the probability of target
acquisition) or internal to the processes. Note, however, that acquiring 60 percent of the Blue force is
the least likely outcome, as it occurs with probability 0! Even if we chose the most likely result of
100 percent detection (which is not the way that expected value models generally handle continuous
variables as opposed to choices), we run into problems.
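
The arithmetic of the detection example can be made concrete with a short sketch. It encodes only the two-point distribution described above (all Blue units seen with probability 0.6, none with probability 0.4); the variable names are illustrative.

```python
from fractions import Fraction

# Fraction of Blue units detected -> probability of that outcome.
outcomes = {Fraction(1): Fraction(6, 10), Fraction(0): Fraction(4, 10)}

expected_fraction = sum(value * prob for value, prob in outcomes.items())
prob_of_expected = outcomes.get(expected_fraction, Fraction(0))
most_likely = max(outcomes, key=outcomes.get)

print(expected_fraction)   # 3/5, i.e. 60 percent detected "on average"
print(prob_of_expected)    # 0: the expected value is never actually observed
print(most_likely)         # 1: full detection is the single most likely outcome
```

The expected value lies between the only two outcomes that can occur, which is why feeding it to a deterministic model represents a situation of probability zero.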

Now let us combine the two examples. It is reasonable to suppose that if the Red force can see
the Blue force, or even a large percentage of the force, it will notice that one force is armed with
Supertank and the other with bows and arrows. Thus, given detection, it will engage the weaker
(bows and arrows) force. If we have the model take the most likely values in the two examples, it
will (1) detect 100 percent of the Blue force and (2) go northwest to engage the Blue force. Each
result is by itself most likely, yet the combined result is the most unlikely. Even if one modeled the Red force
detection at 60 percent, the combination of a 60 percent detection (still sufficient to distinguish
between Supertank and bows and arrows) and moving northwest is unlikely.

Admittedly, these examples are simplistic. Yet it is true that expected value models not only fail
to yield the expected value of the output, they also fail to yield the most likely output. What, then,
is the probability associated with the output of an expected value model? The answer to that
question, unfortunately, is “nobody knows.” This is why expected value models can yield
counterintuitive, contradictory, and/or nonsensical results when initially tested. The usual approach
when this occurs is to adjust input data, processes, thresholds, etc. until the model yields
“reasonable” results. Hopefully this yields a model that will provide suitably realistic results with a
different input data set, but there are no guarantees. We unquestionably have no way of determining
the likelihood of any given output from a complex expected value model.

2.  Sources  of  Uncertainty.  There  are  two  areas  of  uncertainty  properly  associated  with  an  expected 
value  model  that  must  be  handled:  uncertainty  in  the  model  input,  and  uncertainty  in  the  model 
processes. 

Unfortunately, a “blessed” input data set is often regarded as certain -- if we have approval for a
set of numbers to be used in the study, then those numbers are the set to use to support our
analysis. Excursions from the base data set for purposes of analysis will vary only a small number of
data items by design; the others remain fixed. Some input data values are truly fixed: the air
distance from Bremen to Munich is an example. Other values may be fixed by scenario; for example,
the daylight hours vary by latitude and time of year; a scenario will fix a time and place that will in
turn determine the appropriate value for daylight. Unfortunately, these scenario-driven items are
often fixed arbitrarily, even when they may have an impact upon the analysis. For example, if a
force is particularly vulnerable to detection by a sensor that requires daylight, you can get different
results in a summer versus winter scenario (which will in turn be different than that obtained using
an arbitrary number like d hours or 12 hours). This difference may even be apparent in studies that
seemingly are not associated with detection -- ammo rates could be significantly different, for
example. This is a simple, obvious example; many others, not so easily identified, exist. We must
regard the input data set as a single realization of many stochastic variables. It is not always clear
which realization to select for use -- averages do not always exist and may not be appropriate.
Furthermore, correlations exist between sets of those data inputs; for example, selecting the most
likely or expected values of cloud cover and rain independently may yield the combination of sunny
with 1 inch of rain! Note that this problem also exists with stochastic (Monte Carlo) models -- they also
require a fixed data set that is not varied from run to run.
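
The cloud cover and rain remark can be illustrated with a toy joint distribution. The numbers below are hypothetical, chosen only so that the separately most likely values of the two inputs form a combination that never occurs; they are not taken from any data set in the paper.

```python
from collections import defaultdict

# Hypothetical joint distribution (in percent) over (sky condition, rainfall in inches).
joint = {
    ("sunny", 0): 40,
    ("partly cloudy", 1): 30,
    ("overcast", 1): 30,
}


def marginal(index):
    """Sum the joint distribution over everything except component `index`."""
    totals = defaultdict(int)
    for outcome, pct in joint.items():
        totals[outcome[index]] += pct
    return dict(totals)


sky, rain = marginal(0), marginal(1)
independent_pick = (max(sky, key=sky.get), max(rain, key=rain.get))
joint_mode = max(joint, key=joint.get)

print(independent_pick, joint.get(independent_pick, 0))   # ('sunny', 1) 0
print(joint_mode, joint[joint_mode])                      # ('sunny', 0) 40
```

Picking each input's mode independently selects "sunny with 1 inch of rain", which has probability zero under the joint distribution, whereas the joint mode is a physically sensible pair.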
Uncertainty also exists in the model processes. Stochastic models generally handle this
uncertainty  through  random  number  draws,  although  they  are  also  subject  to  problems  associated 
with  correlations  (separate  random  number  draws  generally  require  independence)  and  fixed  values 
such  as  thresholds.  The  examples  provided  above  illustrate  some  of  the  problems  associated  with 
handling  process  and  input  uncertainty  within  an  expected  value  model. 

3.  Addressing  Uncertainty  in  Expected  Value  Models.  At  this  point,  it  would  be  nice  to  be  able  to 
make a statement like “the solution to this problem is easy; one simply needs to... .” Unfortunately,
there are no simple, universal solutions to the problems associated with addressing uncertainty in
expected value models. It is clear, however, that any methods that might alleviate the problem must
deal with the uncertainty associated with the data input as well as the uncertainty associated with
the model processes. Furthermore, the uncertainty in the input data justifies the following assertion:
executing an expected value model only once for a given data set does not provide a meaningful
result. If an expected value model is to be used to support analysis, the user must be prepared to
execute  multiple  runs,  varying  in  some  meaningful  fashion  the  input  data  and/or  the  model 
processes,  in  order  to  establish  some  measure  of  the  uncertainty  associated  with  the  output  of  such  a 
model. 

Ideally,  such  an  approach  will  minimize  the  number  of  runs  required  (because  running  a  large 
expected  value  model  can  be  very  costly),  yet  provide  a  significantly  different  result  from  each  run, 
thus increasing the variance across all outputs. We want to be able to describe the probability that
the  conditions  represented  in  the  input  for  each  run  (or  conditions  similar  to  those  represented)  will 
occur. 


We  have  developed  an  approach  to  handling  input  uncertainty  in  theater-level  expected  value 
models in situations when the outcomes of interest can be expressed in terms of binary variables;
i.e., one can describe all events as “yes” or “no,” “on” or “off,” etc. The particular application that
will  be  developed  deals  with  a  theater-level  tactical  nuclear  exchange. 


Several  models  of  conventional  warfare  exist  at  the  theater  level.  The  model  used  at  CAA  is 
called  the  Force  Evaluation  Model  (FORCEM).  Like  most  theater-level  models  and  scenarios, 
FORCEM  is  a  low  resolution  expected  value  model,  representing  combat  forces  at  the  division  and 
higher  level  and  time  in  12-hour  steps.  The  Nuclear  Effects  Model  Embedded  Stochastically  in 
Simulation (NEMESIS) research at CAA (Youngren [1989]) documents an analytic model for
describing the possible outcomes of a theater-level tactical nuclear exchange. The methodology
described in this paper arose from the need to summarize the stochastic outcomes of the theater-level
exchange as input to FORCEM.


4.  The  Scenario.  In  a  theater-level  battle  where  nuclear  weapons  may  be  employed,  the  commander 
of  the  forces  on  a  side  may  have  an  overall  objective  (such  as  stabilizing  the  forward  line  of  own 
troops  (FLOT)  in  the  defense  or  achieving  a  breakthrough  in  the  offense)  that  will  necessitate  the 
use  of  nuclear  weapons.  In  order  to  meet  this  objective,  the  commander  will  specify  the  defeat 
criteria  against  each  unit  -  that  is,  the  necessary  degree  of  damage  to  be  achieved  against  each  unit 
to  meet  his  objective.  The  defeat  criteria  will  differ  from  unit  to  unit  depending  upon  the  unit 
mission, the posture, the equipment, etc. The criteria applied to larger units (such as divisions) will
frequently focus fires on critical subordinate units. For example, the defeat criteria for a unit might
be achieving a latent lethal dose (about 450 rad) against at least 50 percent of the personnel in the
unit. The defeat criteria for a particular division might be to defeat at least 50 percent of the
infantry units or at least 40 percent of the armor units in the division.

Although  the  effects  of  a  tactical  nuclear  laydown  at  the  theater  perspective  are  normally 
described  in  terms  of  defeating  divisions,  tactical  nuclear  weapons  within  the  theater  are  targeted 
against forces at the company and battery level. The term subunit (also target or target subunit)
used in this paper denotes a combat organization (such as a company) that would be targeted by a
nuclear weapon. The size of the subunit will depend both upon the capabilities of the weapon system
used to engage the subunit and the targeting doctrine of the firer. For example, companies may be
targeted  close  to  the  FLOT  using  small,  artillery-fired  weapons,  while  battalions  may  be  targeted 
deep  using  missiles  or  air-delivered  weapons.  For  purposes  of  exposition,  we  will  refer  to  the  low- 
resolution  combat  organizations  represented  in  theater  models  such  as  FORCEM  (usually  divisions, 
although  other  forces  may  be  represented  as  well)  as  units. 

There are very many targetable subunits in a typical theater scenario, on the order of $10^4$. As a
result, there are $2^{10^4}$ possible outcomes that can occur in terms of the defeat or failure to defeat each
subunit. Even if we look only at the defeat or failure to defeat the low resolution aggregate units
represented in our theater model (usually several hundred), we still have on the order of 2 raised to
a power of several hundred possible outcomes. Even with sophisticated techniques and considerable confounding, classical
experimental design approaches require at least one run per variable. The large amount of time and
effort required to execute even a simple run of a typical theater-level expected value model prohibits
more than a few model runs for any study. Classical experimental designs therefore obviously cannot
be  applied.  Our  objective  is  to  construct  a  plan  that  minimizes  the  number  of  different  input  data 
sets  (thus  minimizing  the  number  of  theater-level  model  runs)  yet  fully  reflects  the  range  of  possible 
outcomes  of  the  theater  nuclear  exchange. 
5. A Method for Addressing Input Uncertainty in Expected Value Models. Describing the outcome of
the theater-level nuclear exchange on each unit in terms of defeat criteria allows us to define a
binary variable $B_i$, where $B_i = 1$ if the unit is defeated; 0 otherwise. Given the assumption that the
outcome is independent between units, the outcome of any exchange is simply a set of 0’s and 1’s
with the probability that any $B_i = 1$ equal to the probability that unit $i$ is defeated, $i = 1, \ldots, m$.
Methods for easily calculating the probability of defeat for each targetable subunit are
given in Youngren [1989]. Given $m$ units, there are $2^m$ possible outcomes. Clearly, if we define defeat
criteria in terms of total numbers of potential nuclear targets (on the order of $10^4$), there are too
many outcomes to enumerate.

At the theater level, however, defeat criteria can usually be expressed in terms of divisions and a
limited number of other high value targets -- on the order of at most several hundred across a
theater. Each division, in turn, will have its defeat criteria established in terms of units subordinate
to that division. For example, suppose that a division $j$ has 10 battalions of infantry (engaged as
battalions), 24 armored companies (engaged as companies), and 20 batteries of artillery. The defeat
criteria for this division may be 50 percent of the infantry, 40 percent of the armor, or 60 percent of
both, with a separate criteria for artillery (divisional and nondivisional). In terms of maneuver
subunits, 5 infantry battalions or 10 armor companies must be defeated in order to defeat the
division. There are

$$\binom{10}{p}\binom{24}{q} = \frac{10!}{p!\,(10-p)!} \cdot \frac{24!}{q!\,(24-q)!}$$

ways of choosing $p$ infantry battalions and $q$ armored companies for defeat, and all combinations
where $p \ge 5$, $q \ge 10$, or $(p + q) \ge$ 60 percent of the subunits (which can be worked out for
specific values of $p$ and $q$) lead to the defeat of this division.
If we assume that each subunit $i$, $i = 1, \ldots, 34$, has a unique probability of defeat $p_{defeat}(i)$, we
probably do not wish to enumerate all sets of subunits where the division is defeated and compute
the joint probability (which will be the product of $p_{defeat}(i)$ for the subunits $i$ defeated and
$(1 - p_{defeat}(i))$ for the subunits $i$ that are not). Fortunately, this situation is readily amenable to
Monte Carlo solutions. We simply need to draw 34 binary pseudorandom numbers $B_i$ such that each
number $B_i = 1$ with probability $p_{defeat}(i)$, and let a binary variable, say $D_n$, equal 1 if the set of
numbers $B_i$ drawn correspond to division $j$ being defeated, 0 otherwise. If we perform $N$ replications
of this experiment, we can estimate

$$P[\text{division defeated}] = \frac{1}{N} \sum_{n=1}^{N} D_n.$$

If we do this for each division $j$, then we have a probability $p_{defeat}(\text{div } j) = P[\text{division } j \text{ defeated}]$ for $j = 1, \ldots, ndiv$, where $ndiv$
is the number of divisions.
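
The Monte Carlo procedure for a single division can be sketched as follows. The defeat rule uses the two thresholds stated in the example (at least 5 of 10 infantry battalions or at least 10 of 24 armored companies) and leaves out the combined criterion; the subunit probabilities, the seed and the function names are hypothetical and serve only to make the sketch runnable.

```python
import random


def division_defeated(inf_hits, arm_hits, inf_needed=5, arm_needed=10):
    """Assumed defeat rule: enough infantry battalions or enough armored companies."""
    return sum(inf_hits) >= inf_needed or sum(arm_hits) >= arm_needed


def estimate_division_defeat(p_inf, p_arm, n_reps=10_000, seed=0):
    """Estimate P[division defeated] as (1/N) * sum of D_n over N replications."""
    rng = random.Random(seed)
    defeats = 0
    for _ in range(n_reps):
        # B_i = 1 with probability p_defeat(i), drawn independently per subunit.
        inf_hits = [rng.random() < p for p in p_inf]
        arm_hits = [rng.random() < p for p in p_arm]
        defeats += division_defeated(inf_hits, arm_hits)   # this is D_n
    return defeats / n_reps


# Hypothetical subunit defeat probabilities: 10 infantry battalions, 24 armored companies.
p_inf = [0.30 + 0.03 * i for i in range(10)]
p_arm = [0.20 + 0.01 * i for i in range(24)]
estimate = estimate_division_defeat(p_inf, p_arm)
print(estimate)
```

Repeating the call once per division, each with its own subunit probabilities, would give the set of division-level defeat probabilities used in the rest of the method.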

At the division level, we can define a binary variable $O_j$ to define the outcome of the nuclear
exchange with respect to division $j$, $j = 1, \ldots, ndiv$: $O_j = 1$ with probability $p_{defeat}(\text{div } j)$ if
division $j$ is defeated; 0 otherwise.
Across the theater, the theater commander will desire at least a certain percentage of units be
defeated in order for the employment of nuclear weapons to be considered effective. We can define a
binary function of the random variables $\mathbf{O}$, $\phi(\mathbf{O})$, such that $\phi(\mathbf{O}) = 1$ if the commander’s
objective is met; 0 otherwise. Clearly $\phi(\mathbf{O})$ is nondecreasing in $\mathbf{O}$. The function $\phi$ may be regarded
as identical to a structure function of a coherent system in reliability theory (Barlow & Proschan
[1981]); thus we can use results from coherent structure theory in our analysis of the nuclear
exchange issue.

For example, if any $k$ out of $m$ divisions must be defeated in order for the commander’s
objective to be met,

$$\phi(\mathbf{O}) = (O_1 O_2 \cdots O_k) \amalg (O_1 O_2 \cdots O_{k-1} O_{k+1}) \amalg \cdots \amalg (O_{m-k+1} \cdots O_m),$$

for all possible subsets of size $k$ from the $m$ units, $1 \le k \le m$, where
$(X_i) \amalg (X_j) \equiv 1 - (1 - X_i)(1 - X_j)$.

Furthermore, we can bound $P[\phi(\mathbf{O}) = 1]$ by (Barlow & Proschan [1981] p. 31):

$$\max_{1 \le r \le npath}\ \prod_{i \in P_r} P[O_i = 1] \;\le\; P[\phi(\mathbf{O}) = 1] \;\le\; \min_{1 \le s \le ncut}\ \coprod_{i \in K_s} P[O_i = 1],$$

where $P_r$ denotes one of the $npath = \binom{m}{k}$ possible min path sets (in this case, a min path
set is any set of $k$ units), $K_s$ denotes one of the $ncut = \binom{m}{m-k+1}$ possible min cut sets (in this case, a
min cut set is any set of $m-k+1$ units), and $\coprod_i X_i \equiv 1 - \prod_i (1 - X_i)$. If we let $p_o(i) =
P[O_i = 1]$, and number the units such that $p_o(1) \le p_o(2) \le \cdots \le p_o(m)$, then

$$\max_{1 \le r \le npath}\ \prod_{i \in P_r} P[O_i = 1] = \prod_{i=m-k+1}^{m} p_o(i), \qquad \min_{1 \le s \le ncut}\ \coprod_{i \in K_s} P[O_i = 1] = \coprod_{i=1}^{m-k+1} p_o(i).$$

This example of a $k$ out of $m$ defeat criteria shows how we can estimate (through bounds) the
probability that the commander's objective may be met. Alternatively, $P[\phi(\mathbf{O}) = 1]$ can be
estimated using the same Monte Carlo technique used to find $P[O_j = 1]$ for each division $j$.
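
The k-out-of-m bounds are easy to compute once the division defeat probabilities are sorted. The sketch below assumes independent divisions: the lower bound multiplies the $k$ largest probabilities (the best min path set), and the upper bound is one minus the product of the complements of the $m-k+1$ smallest (the best min cut set). For comparison it also computes the exact probability by dynamic programming, which is an addition for checking purposes and not part of the paper's method. The probabilities and function names are hypothetical.

```python
from math import prod


def k_out_of_m_bounds(probs, k):
    """Min-path lower bound and min-cut upper bound on P[at least k of m defeated]."""
    p = sorted(probs)
    m = len(p)
    lower = prod(p[m - k:])                           # the k likeliest defeats all occur
    upper = 1 - prod(1 - x for x in p[:m - k + 1])    # a defeat among the m-k+1 least likely
    return lower, upper


def k_out_of_m_exact(probs, k):
    """Exact P[at least k defeated] for independent units, by dynamic programming."""
    dist = [1.0]                                      # dist[c] = P[exactly c defeated so far]
    for x in probs:
        nxt = [0.0] * (len(dist) + 1)
        for c, mass in enumerate(dist):
            nxt[c] += mass * (1 - x)
            nxt[c + 1] += mass * x
        dist = nxt
    return sum(dist[k:])


probs = [0.9, 0.8, 0.7, 0.6, 0.5]                     # hypothetical p_o(i) values
lower, upper = k_out_of_m_bounds(probs, k=3)
exact = k_out_of_m_exact(probs, k=3)
print(lower, exact, upper)
```

With these illustrative values the bounds are roughly 0.504 and 0.94, which shows how loose they can be; that looseness is one reason the Monte Carlo estimate may be preferred.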


6. Partitioning the Space of All Possible Outcomes. At the theater level with a total of $nt$ division-sized
and high value targets, if we examine the nuclear exchange outcome $O_j$ for each division (or
equivalent high-value target), there are $2^{nt}$ possible outcomes. It may be the case that it makes a
difference in the battle that follows the nuclear exchange which units are defeated or targets
destroyed in the exchange. Or, more simply, it may be how many units are defeated and targets
destroyed across the theater which makes a difference.
It is possible to define sets of outcomes of the nuclear exchange that, given our best judgment,
we expect to have a significantly different effect on any subsequent theater-level battle (if all
outcomes have approximately the same effect, then there is one set consisting of all outcomes). We
choose these sets by selecting partitions dividing the sample space (space of all possible outcomes)
into strata such that the following properties are met:

(1) All events within a given stratum will yield approximately the same overall theater-level
outcome. As a result of this assumption, we regard all events within any given stratum as
exchangeable.


(2) Any set of n events from n different strata are expected to yield n different theater-level
outcomes. Thus, any pair of events from two different strata are not exchangeable.

In practice, all events within a stratum will not be truly exchangeable, and the two events to
either “side” of any partition will likely lead to similar theater-level outcomes. Nevertheless, it is
possible to conceive of outcome sets with different results, and we assume for all of the development
below that these two properties are obeyed.


For example, suppose that there are 20 opposing divisions in a sector of combat. Our best
judgment, given the tactical and operational situation, is that the defeat of at least 7 divisions out of
the 20 will be required to avoid loss of territory (stabilize the FLOT -- which may be the
commander’s objective). However, if 14 or more divisions are defeated, an opportunity occurs not
merely to stabilize the FLOT but also to conduct a successful counterattack. In this case, with $O_i = 1$
if division $i$ is defeated, $i = 1, \ldots, 20$, there are $2^{20}$ possible outcomes. We can partition the sample
space of possible outcomes into the $\sum_{k=0}^{6} \binom{20}{k}$ outcomes where 6 or fewer divisions are defeated, the
$\sum_{k=7}^{13} \binom{20}{k}$ outcomes where 7 or more but less than 14 divisions are defeated, and the $\sum_{k=14}^{20} \binom{20}{k}$
outcomes where 14 or more divisions are defeated.
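
The sizes of the three strata in this example follow directly from the binomial sums. A minimal sketch, with stratum labels chosen here for readability:

```python
from math import comb

N_DIVISIONS = 20

# Number of defeated divisions covered by each stratum of the example.
strata = {
    "6 or fewer": range(0, 7),
    "7 to 13": range(7, 14),
    "14 or more": range(14, 21),
}

sizes = {name: sum(comb(N_DIVISIONS, k) for k in ks) for name, ks in strata.items()}
for name, size in sizes.items():
    print(name, size)
print("total", sum(sizes.values()))   # 1048576 == 2 ** 20
```

The strata contain 60,460, 927,656 and 60,460 outcomes respectively. These are counts of outcomes, not probabilities; the probability of each stratum depends on the individual division defeat probabilities.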


The example given above involved two partitions (three strata); the number of partitions
required depends on the number of significantly different theater-level outcomes that need to be
represented. Selecting the partitions will require experienced judgment and possibly some
experimentation with the theater model. If one is unsure about how many partitions to select, the
number of strata should equal the maximum number of theater model runs one can afford.
7. Stratified Sampling from the Sample Space. Once the sample space (space of all possible
outcomes)  has  been  identified,  it  is  possible  to  perform  a  stratified  sampling  from  the  sample  space, 
each  sample  from  the  outcome  of  the  nuclear  exchange  model  forming  an  input  vector  to  the 
theater-level  conventional  model.  From  each  stratum  created  by  our  partitions,  a  single  realization 
can  be  sampled.  A  random  sampling  approach  can  be  used;  however,  since  the  actual  likelihood  of 
all  of  the  events  within  a  stratum  may  vary  widely,  we  recommend  using  a  fixed  sampling  scheme, 
in  particular  sampling  the  mode  from  each  partition.  Given  the  assumption  of  exchangeability 
between  events  within  a  stratum,  any  choice  will  have  a  roughly  equivalent  effect  on  the  theater- 
level  outcome,  so  any  choice  is  valid.  Using  the  mode  allows  us  to  compensate  for  the  fact  that  the 
events  within  the  stratum  are  only  approximately  exchangeable.  A  modal  (most  likely)  outcome  will 
also  form  a  plausible  input  suitable  for  subsequent  analysis.  The  theater-level  conventional  model, 
such as FORCEM, will be run $n_s$ times, once for each of the $n_s$ strata created from $n_s - 1$ partitions, using
the outcome selected from each stratum as an input. If the second assumption that we made in
selecting the partitions is met, the $n_s$ battles simulated in FORCEM using outcomes from the $n_s$
different strata should yield noticeably different results. The response surface estimated using these
$n_s$ FORCEM runs should provide a better representation of the variability possible in theater-level
combat where nuclear weapons are employed than a random selection of $n_s$ outcomes from the $2^{n_t}$
outcomes possible, where $n_t$ is the number of targetable subunits in the theater.

The question naturally arises, "What if I am wrong in selecting the partitions?" Partitioning is a
judgmental process, more of an art than a science. The situation in which this technique is to be
used is one where many runs of the deterministic model are not possible; therefore, it is not possible
to sample the results of many outputs given many different input data sets describing different
nuclear exchange outcomes. As a result, we simply do our best to force realizations from
areas of the space of all possible outcomes where we think that the theater-level outcome will be
different. The impact of being wrong is not much different from that of being right: we still have another
point in the theater-level outcome space that we are sampling. The fact that the nuclear exchange
outcome did not lead to the theater-level outcome expected should be of great interest to the
analysis. Either the theater model has deficiencies in correctly representing the impact of the
exchange, or the theater situation is (surprisingly) robust to the exchange. If the theater outcome
that we tried to create (by selecting the nuclear exchange outcome stratum) is still of interest,
another run could be attempted (if time and resources permit), sampling from a more extreme point
within the stratum.

8. Selecting the Most Likely Outcome (Mode) From Each Stratum. Selecting the mode from each
stratum is simple and not computationally intensive. The partitions defining the stratum will
establish the outcome vectors $O$ that fall within each stratum. Recall that $p_o(j) = P[O_j = 1]$, and
let $q_o(j) = 1 - p_o(j)$. Order the $p_o(j)$'s and $q_o(j)$'s together from the largest to the smallest value. To
select the mode within each partition, start from the first value ($p_o(j)$ or $q_o(j)$) and select the outcome
$O_j = 1$ for each $p_o(j)$ and the outcome $O_j = 0$ for each $q_o(j)$. Continue until each target $j$ has an
outcome assigned, making sure to assign only one outcome to each target. It will be necessary to
"skip" over the higher probability ($p_o(j)$ or $q_o(j)$) for some targets $j$ in order to have the total set of
outcomes fall within the partition.

This procedure can most easily be understood through an example. Suppose we have five
divisional units with the following probabilities of defeat ($P[O_j = 1]$): $p_o(1) = 0.2$, $p_o(2) = 0.25$,
$p_o(3) = p_o(4) = 0.4$, $p_o(5) = 0.6$. We also have the following strata defined in terms of number of
units defeated: {0, 1}, {2, 3, 4}, and {5}. We order our probabilities as follows:
$q_o(1) = 0.8 > q_o(2) = 0.75 > p_o(5) = q_o(3) = q_o(4) = 0.6 > p_o(3) = p_o(4) = q_o(5) = 0.4 > p_o(2) = 0.25 > p_o(1) = 0.2$.

The first stratum must have zero or one unit defeated. Thus our mode for the first stratum is
$q_o(1) \cdot q_o(2) \cdot p_o(5) \cdot q_o(3) \cdot q_o(4)$ (i.e., outcomes $O_1 = 0$, $O_2 = 0$, $O_5 = 1$, $O_3 = 0$, $O_4 = 0$), with a
probability equal to $(0.8)(0.75)(0.6)^3 = 0.1296$. The second stratum must have two, three, or four
units defeated and the mode is $q_o(1) \cdot q_o(2) \cdot p_o(5) \cdot q_o(3) \cdot p_o(4)$, with a probability equal to
$(0.8)(0.75)(0.6)^2(0.4) = 0.0864$. In this case, we "skipped" outcome $O_4 = 0$ with probability 0.6 and
selected outcome $O_4 = 1$ with probability 0.4 so that we would have at least 2 units defeated for this
stratum. Note that an equally likely selection would be $q_o(1) \cdot q_o(2) \cdot p_o(5) \cdot q_o(4) \cdot p_o(3)$. The third
stratum must have five units defeated and the mode is $p_o(5) \cdot p_o(3) \cdot p_o(4) \cdot p_o(2) \cdot p_o(1)$, with a
probability equal to $(0.6)(0.4)^2(0.25)(0.2) = 0.0048$.
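
The modes in this example can be cross-checked with a short script. The sketch below does not implement the ordering-and-skipping procedure described above; it simply enumerates all $2^5$ outcome vectors and keeps the most probable one in each stratum, which is only practical for a small number of units. It assumes, as the products above do, that units are defeated independently. The names `p_defeat`, `strata`, `outcome_probability`, and `stratum_modes` are illustrative and do not come from the paper.

```python
from itertools import product
from math import prod

# Probabilities of defeat p_o(j) for the five units in the worked example.
p_defeat = [0.2, 0.25, 0.4, 0.4, 0.6]

# Strata expressed as sets of "number of units defeated".
strata = [{0, 1}, {2, 3, 4}, {5}]


def outcome_probability(outcome, p):
    """Probability of one outcome vector, assuming independent units."""
    return prod(pj if oj == 1 else 1.0 - pj for oj, pj in zip(outcome, p))


def stratum_modes(p, strata):
    """Return (mode outcome, probability) for each stratum by brute force."""
    modes = []
    for stratum in strata:
        # All binary outcome vectors whose defeat count lies in this stratum.
        candidates = [o for o in product((0, 1), repeat=len(p)) if sum(o) in stratum]
        best = max(candidates, key=lambda o: outcome_probability(o, p))
        modes.append((best, outcome_probability(best, p)))
    return modes


modes = stratum_modes(p_defeat, strata)
for outcome, prob in modes:
    print(outcome, round(prob, 4))
```

In the second stratum two outcomes tie (unit 3 or unit 4 defeated together with unit 5); which of the two the script reports depends on floating-point rounding, so only the probability should be relied on there.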

9. Interpreting the Results of Conventional Runs Using Stratified Inputs. If we wish to obtain an
output measure from the theater-level conventional model averaged across all possible
outcomes (which is the sort of thing we normally do in our simulation models), we need to construct
a weighted average from the $n_s$ runs conducted using the theater model. The weight assigned to the
output measure from each run $k$ would be the total likelihood of all events within stratum $k$,
$k = 1, \ldots, n_s$. If it is possible to enumerate all of the possible outcomes ($n_t$ sufficiently small), this
likelihood can be computed directly. If $n_t$ is too large, we can conduct a simple Monte Carlo
estimation of the probability $p_k$ that an event chosen at random falls within stratum $k$,
$k = 1, \ldots, n_s$. This is the straightforward process of estimating the vector $\{p_k\}$ from a multinomial
distribution.
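
A minimal sketch of such a Monte Carlo estimate follows. It reuses the five-unit probabilities and strata from the example in the previous paragraph, assumes independent units, and uses hypothetical names (`estimate_stratum_probabilities`, `n_draws`, `seed`) chosen for illustration. The number of draws is arbitrary.

```python
import random

# Probabilities of defeat for the five units of the worked example.
p_defeat = [0.2, 0.25, 0.4, 0.4, 0.6]

# Strata expressed as sets of "number of units defeated".
strata = [{0, 1}, {2, 3, 4}, {5}]


def estimate_stratum_probabilities(p, strata, n_draws, seed=0):
    """Estimate the probability that a random exchange outcome falls in each stratum."""
    rng = random.Random(seed)
    counts = [0] * len(strata)
    for _ in range(n_draws):
        # Draw one exchange outcome: each unit is defeated independently.
        defeated = sum(1 for pj in p if rng.random() < pj)
        for k, stratum in enumerate(strata):
            if defeated in stratum:
                counts[k] += 1
                break
    # Relative frequencies are the usual multinomial estimates.
    return [c / n_draws for c in counts]


estimates = estimate_stratum_probabilities(p_defeat, strata, n_draws=100_000)
print(estimates)
```

With five units the exact likelihoods are easy to compute, so here the estimate serves only as a check; the approach matters when $n_t$ is too large to enumerate.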

We can return to the previous example to illustrate an exact computation of the likelihood of all
events within a stratum. Recall that the strata were defined in terms of number of units defeated:
{0, 1}, {2, 3, 4}, and {5}. The probability that 0 units are defeated is
$P\{0\} = q_o(1) \cdot q_o(2) \cdot q_o(3) \cdot q_o(4) \cdot q_o(5) = 0.0864$. There are $\binom{5}{1} = 5$ possible outcomes leading to 1 unit
destroyed; they are:

$p_o(1) \cdot q_o(2) \cdot q_o(3) \cdot q_o(4) \cdot q_o(5)$, $q_o(1) \cdot p_o(2) \cdot q_o(3) \cdot q_o(4) \cdot q_o(5)$, $q_o(1) \cdot q_o(2) \cdot p_o(3) \cdot q_o(4) \cdot q_o(5)$,
$q_o(1) \cdot q_o(2) \cdot q_o(3) \cdot p_o(4) \cdot q_o(5)$, $q_o(1) \cdot q_o(2) \cdot q_o(3) \cdot q_o(4) \cdot p_o(5)$

with a total probability of 0.0216 + 0.0288 + 0.0576 + 0.0576 + 0.1296 = 0.2952. Thus the total
likelihood of the events in the first stratum is 0.0864 + 0.2952 = 0.3816.

The calculations for P{2}, P{3}, and P{4} are messy (more combinations) but straightforward.
The  likelihoods  are  P{2}  =  0.3612,  P{3}  =  0.2012,  and  P{4}  =  0.0512,  for  a  total  likelihood  of 
0.6136.  The  likelihood  of  the  third  stratum  is  P{5}  =  0.0048. 
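
The "messy" sums over combinations can be avoided with a simple recursion that adds one unit at a time. The sketch below is one way to do this, assuming independent units; the function name `defeated_count_distribution` and the variable names are illustrative and not taken from the paper.

```python
def defeated_count_distribution(p):
    """Return [P{0}, P{1}, ..., P{n}] for independent units with defeat probabilities p."""
    dist = [1.0]  # with no units considered, zero defeats is certain
    for pj in p:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - pj)  # this unit survives
            new[k + 1] += mass * pj      # this unit is defeated
        dist = new
    return dist


p_defeat = [0.2, 0.25, 0.4, 0.4, 0.6]
strata = [{0, 1}, {2, 3, 4}, {5}]

dist = defeated_count_distribution(p_defeat)
stratum_likelihood = [sum(dist[k] for k in stratum) for stratum in strata]

print([round(x, 4) for x in dist])
print([round(x, 4) for x in stratum_likelihood])
```

The recursion needs on the order of $n^2$ multiplications for $n$ units, so it stays practical well beyond the point where listing every combination does not.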

10. Adjustments. In practice, several cases may arise where it is desirable to make some adjustments
to the basic model. We describe some of them here.

a. Likelihood of any realization within a stratum being too small. In some cases, the total
likelihood of any realization from a particular stratum may be too small to justify further
consideration. An example of this is the third stratum ({5}) discussed in the previous paragraph. A
probability of less than 0.01 is likely small enough to ignore in our theater-level modeling (this
threshold is, of course, a matter of judgment). In cases such as this, we may wish to simply run the
conventional theater model with the modes from the more likely (in the example, the first and
second) strata.

b. The modes from two strata are outcomes that are adjacent to one another. It is possible that
the modes from two strata are at the boundary of their respective strata, next to the same partition,
and thus adjacent to one another in terms of an ordered outcome space. An example of this is also
provided in the previous paragraph, where the modes from the first two strata are adjacent to one
another in terms of units defeated (one unit defeated in the first stratum and two in the second). In
order to reinforce our second assumption (different results from different strata), we may wish to
make a different selection from one stratum or the other in order to avoid similar results. Two
possible adjustments come to mind.

(1) The first adjustment is to select the next highest likelihood from within either stratum that
does not provide the same number of units defeated as does the mode. In our example, we would
choose either an outcome of zero units defeated from the first stratum or three or four units defeated
from the second stratum. The most likely outcome where zero units are defeated is
$q_o(1) \cdot q_o(2) \cdot q_o(3) \cdot q_o(4) \cdot q_o(5) = 0.0864$. The most likely outcome where three or four units are
defeated is $q_o(1) \cdot q_o(2) \cdot p_o(5) \cdot p_o(3) \cdot p_o(4) = 0.0576$. Since 0.0864 > 0.0576, we could choose the
outcome of zero units defeated from the first stratum and keep the outcome we previously computed
(two units defeated) for the second stratum.

(2) The second possible adjustment is to define partitions such that there are "gaps" between
the strata. In our previous example, we might define significantly different outcomes coming from
zero or one units defeated, three or four defeated, and five defeated, where the outcome of two units
defeated may be an ambiguous case leading to either the same result as {0, 1} or {3, 4} defeated
units. This approach may be more realistic, as the "transitional cases" at the boundaries of the
exhaustive strata may lead to theater outcomes that are not as clear cut as those nearer the center of
any particular stratum. The only drawback to this approach is the fact that the total likelihood of
drawing results from any of the strata will not equal one.
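
As a worked illustration of this drawback, take the likelihoods computed for the five-unit example (P{0} + P{1} = 0.3816, P{3} = 0.2012, P{4} = 0.0512, P{5} = 0.0048, P{2} = 0.3612). With the gapped strata {0, 1}, {3, 4}, and {5}, the total likelihood covered is:

```latex
P\{0,1\} + P\{3,4\} + P\{5\}
  = 0.3816 + (0.2012 + 0.0512) + 0.0048
  = 0.6388
  = 1 - P\{2\}
```

If a weighted average over these strata is wanted, one option is to divide each stratum likelihood by 0.6388 so that the weights sum to one. That renormalization is an assumption here, in the spirit of the normalization used for repeated exchanges, and is not prescribed by the adjustment itself.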

11. Repeated Exchanges. Until now, we have assumed that there is essentially only one nuclear
exchange of interest. In other words, we have assumed that the nuclear weapons will be employed
during a relatively small timeframe within the overall theater battle, and that the theater battle will
be conventional thereafter (at least for the duration of the conflict to be simulated). However, it is
possible that a scenario may call for repeated exchanges of nuclear weapons. We can handle each
exchange by defining the outcomes through binary variables and stratifying the outcome space as
explained above. However, constructing an experimental plan with a reasonable number of runs of
the theater model becomes difficult. The difficulty arises from the total number of possible
combinations of individual exchange outcomes, even if only a few strata are chosen for each
exchange. For example, only three exchanges with only three significantly different outcomes (strata)
predicted per exchange will lead to $3^3 = 27$ different possible outcomes after all three exchanges. It
is probably too expensive to execute this many runs of a theater-level simulation model.

To handle such a situation, we begin by determining the probability of defeating each theater-
level unit and partitioning the set of all possible outcomes as explained previously. We can diagram
the 27 possible outcomes for our example as shown below in Figure 1. If 27 runs are too many to
execute on our theater-level simulation, then we must select a smaller subset of the 27 outcomes to
actually use. The question is, of course, which subset do we pick? A stochastic simulation will
randomly select paths through the "tree" (Figure 1) by selecting individual exchange outcomes
randomly according to their likelihoods. When a stochastic simulation is run multiple times, the
paths with a high probability of occurrence will be selected multiple times and the paths with a low
probability of occurrence will be selected infrequently if at all. The result is a weighted set of
outcomes that can be used to estimate the distribution of the actual outcome after three exchanges.
In our case, we cannot even afford to run the model once for each possible outcome, much less
multiple times. However, we have the same objective of trying to determine a set of outcomes
corresponding to particular paths that can be weighted to estimate the distribution of the actual
outcome after three exchanges.


Figure  1.  Possible  Outcomes  from  Three  Exchanges  with  Three  Strata  Each 

Following the example diagrammed in Figure 1, let us label the strata at each exchange as high
(H), medium (M), and low (L) corresponding to some exchange outcome along some measure (e.g.,
total units defeated). We can bound the outcome using the extreme choices at each decision point in
our tree, i.e., HHH for an upper bound and LLL for a lower bound. We can also choose an
intermediate outcome (MMM) in this case by choosing the intermediate result at each decision point
(note that there may not always be a clearly defined "middle"). Beyond this, we need some sort of
rationale for selecting particular outcomes out of the 27 possible. It is important to note that the
variables are nested. For example, the middle outcome from a second strike following a high
outcome from the first exchange (HM) will be different from the middle outcome from a second
strike following a low outcome from the first exchange (LM), because the force strengths surviving
the first exchange (and thus the subsequent theater battle before the second exchange) are
significantly different.

Several approaches come to mind, both qualitative and quantitative. Qualitative approaches will
choose outcomes according to the strata: for example, alternating sequences such as HML, LMH, and
MLH could be chosen.


Quantitative approaches will look at the probability assigned to each stratum. For purposes of
illustration, assume that the probabilities for the outcomes (H, M, L) are (.2, .5, .3) respectively,
and that the probabilities for H, M, and L are identical for each of the three exchanges (in reality,
this would be unlikely but it suffices for illustration). We select our runs according to their
probabilities. For example, the most likely outcome will be MMM with probability $(.5)^3 = 0.125$.
The next most likely are LMM, MLM, and MML with probability $(.5)^2(.3) = 0.075$, etc. We can
concentrate on choosing the outcomes with the greatest likelihood (possibly in addition to the
bounds HHH and LLL).
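
The ranking of paths by likelihood can be sketched in a few lines. The script below enumerates all 27 paths and sorts them by probability, under the same illustrative assumption made above that the stratum probabilities (.2, .5, .3) are identical and independent across exchanges. The names `stratum_prob`, `paths`, and `ranked` are hypothetical.

```python
from itertools import product
from math import prod

# Illustrative stratum probabilities, assumed identical for every exchange.
stratum_prob = {"H": 0.2, "M": 0.5, "L": 0.3}
n_exchanges = 3

# Probability of each path through the tree, e.g. "MLM" -> 0.5 * 0.3 * 0.5.
paths = {
    "".join(path): prod(stratum_prob[s] for s in path)
    for path in product("HML", repeat=n_exchanges)
}

# Most likely paths first.
ranked = sorted(paths.items(), key=lambda item: item[1], reverse=True)

for name, prob in ranked[:4]:
    print(name, round(prob, 3))
```

If the stratum probabilities differed by exchange, or depended on the preceding outcome as the nesting discussed earlier suggests, the dictionary would need to be replaced by conditional probabilities for each branch of the tree.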

Interpreting  the  output  becomes  more  difficult  when  we  run  only  a  subset  of  all  possible 
outcome  strata.  In  our  standard  experimental  plan,  we  run  all  possible  outcome  strata  and  weight 
the  result  with  the  probability  associated  with  the  strata.  If  we  do  not  make  any  adjustments  (such 
as  defining  non-adjacent  strata),  the  probabilities  of  a  realization  coming  from  a  stratum  will  sum  to 
1.  When  we  select  a  subset  of  outcome  strata,  the  associated  probabilities  will  not  sum  to  1.  We 
recommend  normalizing  the  probabilities  associated  with  the  outcomes  selected  and  proceeding 
accordingly.  An  example  should  make  this  clear. 

12. Repeated Exchanges - an Example. Suppose we have three exchanges with three significantly
different outcomes (strata) H, M, L with probabilities .2, .5, .3 respectively as stated previously. A
possible selection scheme might be the following.

(1) Select the upper and lower bounds HHH and LLL. The associated probabilities are HHH =
$(.2)^3 = 0.008$ and LLL = $(.3)^3 = 0.027$.

(2) Select the middle (qualitative) or modal (quantitative) outcome. In this case, they are the
same (MMM) with probability $(.5)^3 = 0.125$.

(3) Select the next most likely outcomes LMM, MLM, and MML. The associated probabilities
are equal at $(.5)^2(.3) = 0.075$. Alternatively, some type of alternating strata sequence could be used.

This forms a subset of 6 outcomes out of the 27 possible. The total probability of a realization
coming from any of the 6 selected outcomes is 0.008 + 0.027 + 0.125 + (3)(0.075) = 0.385. The
normalized probabilities are therefore:

HHH = 0.008 / 0.385 = 0.021

LLL = 0.027 / 0.385 = 0.070

MMM = 0.125 / 0.385 = 0.325

LMM, MLM, MML = 0.075 / 0.385 = 0.195 (each).

This  sums  to  1.001  due  to  rounding  error. 

In  this  example  we  would  execute  six  runs  of  the  theater-level  simulation  model,  selecting 
realizations  from  the  strata  associated  with  each  exchange  as  indicated  above  (for  example,  MLM 
would  select  from  the  middle  stratum  for  the  first  and  third  exchange,  and  the  lower  stratum  in  the 
second).  The  theater-level  model  output  associated  with  each  realization  selected  can  be  weighted 
with  the  normalized  probability  of  occurrence. 
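
The normalization step for the six selected paths can be written out as follows. This is a small sketch using the illustrative probabilities (.2, .5, .3) from the example; the dictionary names `selected` and `normalized` are hypothetical.

```python
# Raw probabilities of the six selected paths (H = .2, M = .5, L = .3).
selected = {
    "HHH": 0.2 ** 3,
    "LLL": 0.3 ** 3,
    "MMM": 0.5 ** 3,
    "LMM": 0.3 * 0.5 * 0.5,
    "MLM": 0.5 * 0.3 * 0.5,
    "MML": 0.5 * 0.5 * 0.3,
}

# Total probability covered by the selected subset.
total = sum(selected.values())

# Rescale so that the weights of the selected runs sum to one.
normalized = {name: prob / total for name, prob in selected.items()}

print(round(total, 3))
for name, weight in normalized.items():
    print(name, round(weight, 3))
```

Carried at full precision the normalized weights sum to one; the 1.001 quoted above comes only from rounding each weight to three decimals before adding.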

Note that we only account for 38.5 percent of the possible outcomes in terms of probability. As
a  result,  our  estimates  made  from  only  six  runs  will  not  be  as  good  as  those  produced  from  a  larger 
subset  from  the  27  possible. 

13. Averaging the Results. To continue our example, suppose that an outcome for some particular
measure from a theater conventional model such as FORCEM was 125 for a run using input from
the first stratum, 75 for a run from the second stratum, and 25 for a run from the third stratum. An
average value for this measure would be derived from weighting the output from a given run with
the total probability of any realization coming from within the stratum. In our example, we have
(125)(.3816) + (75)(.6136) + (25)(.0048) = 93.84. This value, along with the range of values
produced by the three different runs (summarized perhaps with a weighted variance or other
statistic), should be much more meaningful than the value obtained by running FORCEM only for
some arbitrarily chosen input set for the nuclear exchange outcome.
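
The weighted average, and one possible form of the weighted variance mentioned above, can be computed as in the sketch below. The variance formula used (probability-weighted squared deviation from the weighted mean) is an assumption; the paper does not specify which weighted statistic it has in mind. Variable names are illustrative.

```python
# Output measure from the three theater-model runs (one per stratum).
outputs = [125.0, 75.0, 25.0]

# Total likelihood of each stratum from the five-unit example.
weights = [0.3816, 0.6136, 0.0048]

# Probability-weighted mean of the output measure.
weighted_mean = sum(w * y for w, y in zip(weights, outputs))

# Probability-weighted variance about that mean (one possible summary statistic).
weighted_var = sum(w * (y - weighted_mean) ** 2 for w, y in zip(weights, outputs))
weighted_sd = weighted_var ** 0.5

print(round(weighted_mean, 2), round(weighted_var, 4), round(weighted_sd, 2))
```

The standard deviation of roughly 25 against a mean of about 94 gives a first indication of how much the three runs disagree, though it hides the fact that the result of 25 carries almost no weight.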

However, a word of caution is necessary. We started with the assumption that there is more
than one significantly different outcome in the theater context; in our example, there were three. A
single summary measure, such as the average, does not reflect this reality. Even a sample average
and variance will not inform a decisionmaker about the possible outcomes along with their
associated probabilities. Since the total number of runs of the theater conventional model will be (by
necessity) small, we recommend reporting all of the results, accompanied perhaps with a summary
measure. In cases of tactical nuclear warfare, we are often concerned with relatively unlikely events
(such as the exchange itself) that nevertheless have a very significant impact. Averaging obscures
this fact, and can lead a decisionmaker astray.

14. Summary. Using a deterministic, expected value approach to model a real-world situation such
as theater-level combat poses problems in selecting input data. A deterministic simulation demands
a single input data set for a model run, while the data may have to represent a process that is
inherently stochastic. An example is provided in this paper. The result of a tactical nuclear
exchange within a theater is inherently stochastic, driven by random events such as target
acquisitions. An "average" exchange outcome cannot properly be defined; an average fails to exist in
subset selection problems (for example, if 20 units out of 50 are acquired on the average, which 20
are to be selected as acquired in the deterministic model?). Even where averages can be defined, they
fail to reflect important variations in possible outcomes that may make a difference between winning
and losing the war in a theater simulation.

Ideally, a theater-level stochastic model would be used to properly reflect uncertainties inherent
in the data and processes represented by the model. However, the current state of the art in
hardware and software only permits us (at present) to model combat at the theater level in a
deterministic, low-resolution mode. Thus, we must reconcile the need to provide an input to those
deterministic models with the reality of random outcomes.

If there are approximately $10^4$ potential nuclear targets in a theater, there are $2^{10^4}$ possible
outcomes that can occur in terms of the defeat or failure to defeat each potential target. Even if we
look only at the defeat or failure to defeat the low-resolution aggregate units represented in our
theater model, we still have on the order of $2^{10}$ possible outcomes. A classical experimental design
approach that requires at least one run per variable obviously cannot be applied. The challenge,
then, is to construct a plan that minimizes the number of different input data sets yet fully reflects
the range of possible outcomes of the theater nuclear exchange.

This paper outlines an approach to constructing such an experimental plan. We begin with the
probability of defeating a potential nuclear target and determine from that the probability
of defeating the aggregate units represented in our theater model (such as divisions). We can
characterize all possible outcomes of the exchange as sets of binary variables, where each binary
variable reflects the defeat or failure to defeat each unit. We then partition the outcome space into
strata such that outcomes from different strata lead to significantly different results in the theater
battle, and all significantly different outcomes are included in some stratum. Our experimental plan
consists of a nuclear exchange realization from each stratum that corresponds to the most likely
outcome within that stratum. The theater-level model is run using the experimental plan to
determine the appropriate input data set to use to reflect the outcome of a theater nuclear exchange.

15. Directions for Future Research. The techniques outlined in this paper form only a start at trying
to resolve the issue of how to handle uncertainty in input to large, complex expected value models.
They are presently limited to input processes that can be summarized in a reasonable number of
binary variables, where it is possible to make a judgment about the type of expected value model
output given sets of similar input realizations. Nevertheless, it is a step in the right direction. At
present, it is not infrequent to find studies based on a single model run per input scenario, without
any estimate of the variability possible in the results obtained.

Possible future research topics include extending the techniques to processes that can be
expressed in various states, the number of such states exceeding two. Better ways of estimating
partitions of the sample space may also be developed. A very realistic case in many theater scenarios
involves repeated realizations of random processes (in the context of the nuclear exchanges discussed
in the paper, this would imply many small weapon exchanges over a relatively long period of time).
At present, we have no satisfactory way of handling this situation. Robust experimental plans that
can provide meaningful results over a large number of repeated realizations will be necessary to
model such scenarios.

Abstract

Stress analysis of the human femur involves uncertainties in material properties,
geometry, loads and boundary conditions. It is desired to propagate these
uncertainties  through  the  Finite  Element  Method  of  stress  analysis  in  order  to 
obtain  the  distributions  of  stresses  and  displacements  in  the  femur.  This  would 
provide  better  insight  into  bone  behavior  and  the  design  of  bone  implants. 

In  particular,  data  from  CT  scans  is  currently  used  to  estimate  the  Young’s 
modulus  of  bone.  The  CT  number  at  any  point  within  the  cross-section  is  used 
to  estimate  the  apparent  density  at  that  point  by  means  of  a  linear  relationship. 
Using  experimental  data  published  by  previous  researchers,  Young’s  modulus 
is  related  to  apparent  density. 

Randomness  in  stresses  and  displacements  can  be  studied  by  either  a  First 
Order-Second Moment (FOSM) method or by simulation. This paper compares the accuracy
of FOSM with that of simulation for a simple deterministic 2-dimensional
geometry.  It  is  observed  that  second  moment  analysis  can  be  adequate  for 
predicting  accurately  the  first  two  moments  of  the  structural  response. 

Randomness  in  loading  is  much  easier  to  analyze  as  compared  to  randomness 
in  Young’s  modulus  because  stresses  and  displacements  are  linear  functions  of 
the  applied  loads.  This  paper  compares  the  relative  importance  of  randomness 
in  loading  to  randomness  in  Young’s  modulus.  Numerical  experiments  with 
random material properties show that randomness in Young’s modulus has
little influence on the randomness in stress when loading is also random.

1 Introduction


A “standard” Finite Element Analysis assumes all input information to be deterministic.
In particular, loads, geometry, material properties and boundary conditions are
assumed by the analyst to be known precisely. Consequently, the results of such
an analysis are also deterministic. In reality, there is considerable variability in this
input data. This randomness affects the structural response. Frequently, designers
use a ‘factor of safety’ to offset their lack of knowledge of the probabilistic aspects of
the response.

Stochastic FEM models uncertain input information by means of random variables.
The first two moments of the structural response can be obtained by a First
Order Second Moment method. Such a method can provide more detailed information
regarding the response than the deterministic finite element method.

Finite element analysis of the femur is currently being performed assuming deterministic
input, in spite of experimental evidence suggesting considerable randomness
in this input data. A study of the effect of randomness in loading and material properties
would help evaluate the accuracy of the deterministic solution. This paper
deals with the effect of randomness in loading and material properties on a simple
2-dimensional model of the proximal femur.

2  Probabilistic  Structural  Analysis 

Probabilistic Structural Analysis deals with analysis of structures in the presence of
uncertainty. It can be used to calculate the first two moments or the distribution
functions of the structural response. Structural reliability theory aims at calculating
the probability of failure for structural systems. Since there are no closed-form expressions
for stresses and displacements obtained by a finite element analysis, Monte
Carlo simulation [Shin 72] must be used to determine the distributions of the response.
Since realistic structural analysis problems tend to be computationally intensive
and detailed probabilistic information regarding the random input data is
rarely available, the approximate First Order Second Moment (FOSM)
method is sometimes more suitable for stochastic finite element analysis.
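
To make the contrast between FOSM and simulation concrete, the sketch below applies both to a deliberately simple stand-in problem: the end displacement u = PL/(EA) of an axially loaded bar with independent, normally distributed load P and Young’s modulus E. The bar model, the numerical values, and the function names (`displacement`, `fosm`, `simulate`) are assumptions made for illustration only; they are not the femur model or the implementation used in this paper.

```python
import random
import statistics

L_BAR = 1.0     # bar length (hypothetical value)
A_BAR = 1.0e-4  # cross-sectional area (hypothetical value)


def displacement(load, modulus):
    """End displacement of an axially loaded bar, u = P L / (E A)."""
    return load * L_BAR / (modulus * A_BAR)


def fosm(mean_load, sd_load, mean_mod, sd_mod):
    """First-order second-moment mean and std. dev. of u for independent inputs."""
    mean_u = displacement(mean_load, mean_mod)
    # Partial derivatives of u evaluated at the input means.
    du_dload = L_BAR / (mean_mod * A_BAR)
    du_dmod = -mean_load * L_BAR / (mean_mod ** 2 * A_BAR)
    var_u = (du_dload * sd_load) ** 2 + (du_dmod * sd_mod) ** 2
    return mean_u, var_u ** 0.5


def simulate(mean_load, sd_load, mean_mod, sd_mod, n_draws, seed=0):
    """Monte Carlo mean and std. dev. of u with normally distributed inputs."""
    rng = random.Random(seed)
    samples = [
        displacement(rng.gauss(mean_load, sd_load), rng.gauss(mean_mod, sd_mod))
        for _ in range(n_draws)
    ]
    return statistics.fmean(samples), statistics.stdev(samples)


# Hypothetical inputs with 10 percent coefficient of variation each.
params = (1000.0, 100.0, 1.0e10, 1.0e9)
fosm_mean, fosm_sd = fosm(*params)
mc_mean, mc_sd = simulate(*params, n_draws=50_000)
print(fosm_mean, fosm_sd)
print(mc_mean, mc_sd)
```

FOSM needs one model evaluation plus derivatives, whereas the simulation needs one evaluation per draw; that cost difference is what matters when each evaluation is a full finite element solve. Because u is linear in P but nonlinear in E, the first-order estimate is exact for load randomness alone and only approximate once E is random.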

Some  of  the  earliest  work  in  this  field  dealt  with  eigenvalue  problems  involving 
random  media  [Coll  69].  Subsequently,  stochastic  finite  element  analysis  has  also  been 
applied  to  beams  with  random  rigidity  [Vanm  83b],  turbopump  blades  [Nagp  87],  etc. 

There  are  several  methods  of  modeling  randomness  in  material  properties  such  as 
Young’s  modulus.  Vanmarcke  [Vanm  83a]  suggested  modeling  the  random  Young’s 
modulus  field  as  a  spatially  varying  stochastic  process.  The  Young’s  modulus  for  a 
finite  element  can  then  be  obtained  by  an  averaging  of  the  stochastic  field  over  the 
finite  element.  Liu  [Liu  86]  modeled  the  Young’s  modulus  within  an  element  by  a 
linear  combination  of  random  Young’s  moduli  at  the  nodes  of  the  element.  Yamazaki 
[Yama 88] considered the Young’s moduli at centroids of finite elements as random
variables. Der Kiureghian [Kiur 88] compared the averaging method with the centroid
method  and  observed  that  these  two  methods  tend  to  bound  the  exact  response 
variability;  the  centroid  method  usually  over-estimates  the  variability  whereas  the 
averaging  method  usually  under-estimates  it. 

3  Analysis 

There  is  considerable  variability  in  the  input  data  for  structural  analysis  problems  in 
biomechanics.  Young’s  modulus  in  bone  is  currently  estimated  using  CT  (Computed 
Tomography)  scans.  The  grey  value  from  these  scans  is  used  to  estimate  the  apparent 
density  by  a  linear  relationship.  The  apparent  density  is  related  to  the  Young’s  mod¬ 
ulus  by  an  experimentally  determined  non-linear  relationship.  There  is  considerable 
variability  in  this  experimental  data.  Therefore  finite  element  models  of  the  proximal 
femur have Young’s moduli which are not deterministic. The grey values in a CT scan
are used to determine the geometry. Distinction between bone and tissue is based
on  a  threshold  which  is  chosen  subjectively  by  the  analyst.  Hence  the  size  of  the  bone 
being  analyzed  is  not  deterministic.  Moreover  the  exact  location  and  magnitudes  of 
loads  are  not  known  precisely. 

The  results  of  a  finite  element  analysis  are  affected  by  all  these  random  inputs. 
Stochastic  finite  element  analysis  can  be  used  to  determine  the  amount  of  randomness 
in  the  response.  Structural  reliability  can  be  used  to  determine  the  probability  of 
failure.  But  in  structural  analysis  of  biomechanical  systems,  where  the  modeling 
uncertainties  and  approximations  are  high,  a  reliability  index  or  a  probability  of  failure 
could  be  very  inaccurate.  Modeling  approximations  include  use  of  linear  elastic  finite 
element  analysis  instead  of  non-linear  visco-elastic  finite  element  analysis,  isotropic 
material  models  instead  of  transversely  isotropic  material  models,  etc. 

The present study was aimed at comparing simulation and FOSM for finite element
analysis of the proximal femur. The relative importance of randomness in material
properties and loading was also studied. A typical coarse 3D finite element model
for the proximal femur contains about 300 elements and 1200 nodes.
Stochastic  finite  element  analysis  of  such  problems  is  therefore  too  expensive.  Hence 
it  was  decided  to  analyze  a  2D  plane  strain  model  of  the  proximal  femur  instead. 
Deterministic  analyses  performed  on  both  these  models  indicate  that  the  results  from 
a  2D  model  are  qualitatively  the  same  as  those  obtained  from  a  3D  model. 

The  random  Young’s  modulus  field  was  modeled  using  the  Young’s  modulus  in 
each  finite  element  as  a  random  variable.  Since  the  variability  in  Young’s  modulus  is 
very  high,  uncorrelated  fluctuations  in  Young’s  modulus  in  adjacent  finite  elements 
can  give  very  unrealistic  material  property  distributions.  Therefore  it  was  necessary 
to assume that the Young’s moduli in different elements were correlated by a spatially
varying correlation function. An exponentially decaying correlation function of
the form $e^{-d/L}$ (where $d$ is the distance between two elements and $L$ is the
“correlation length”) was chosen because of the “intuitive” feeling that Young’s moduli
in nearby elements should not vary independently, whereas Young’s moduli in elements
far apart could be almost uncorrelated.

Figure 1: Variation in standard deviation of displacements
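
As an illustration of the exponentially decaying correlation between element moduli, the following minimal sketch builds a correlation matrix from element centroids. The centroid coordinates and the correlation length of 10 mm are hypothetical values chosen only for the example, and the function name is not from the paper.

```python
import numpy as np


def exponential_correlation(centroids, corr_length):
    """Correlation matrix with entries exp(-d_ij / L) for centroid distances d_ij."""
    centroids = np.asarray(centroids, dtype=float)
    diff = centroids[:, None, :] - centroids[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))  # Euclidean distances between centroids
    return np.exp(-dist / corr_length)


# Hypothetical centroids (mm) of four elements lying along a line.
centroids = [[0.0, 0.0], [5.0, 0.0], [20.0, 0.0], [80.0, 0.0]]
R = exponential_correlation(centroids, corr_length=10.0)
print(np.round(R, 4))
```

Neighbouring elements come out strongly correlated while the element 80 mm away is nearly independent of the first, which is the behaviour the text asks of the correlation function.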

Preliminary  analyses  showed  that  correlation  length  plays  a  very  important  role 
in  determining  the  amount  of  randomness  in  the  response.  Figure  1  and  figure  2 
show  the  variation  in  the  standard  deviation  of  displacements  and  stresses  with  the 
correlation  length  for  a  typical  plane-strain  analysis.  With  an  increase  in  correlation, 
stresses  tend  to  become  deterministic  because  stresses  are  independent  of  Young’s 
moduli,  provided  the  Young’s  moduli  are  changed  uniformly  by  a  constant  factor. 
However  the  displacements  in  this  case  have  maximum  variability.  When  there  is 
little  correlation  between  Young’s  moduli,  the  displacements  are  less  random  but  the 
stresses  are  more  random.  There  is  a  considerable  change  in  the  standard  deviation 
of  the  response  from  a  fully  correlated  to  a  fully  uncorrelated  case.  In  order  to 
obtain  accurate  second  moments  of  the  response,  one  must  use  a  correlation  function. 
However,  the  correlation  function  in  this  case  must  be  based  on  experimental  data. 

Figure 2: Variation in standard deviation of stresses

Figure 3: Young’s modulus - Apparent density relationship [Cart 77]

Figure 4: Comparison of data in [Cart 77] and [Gold 89]

Figure 3 shows the measured pairs of Young’s modulus and apparent density
[Cart 77]. The power law relationship shown is currently being used to predict the
Young’s modulus given apparent density. However, this data cannot be used to determine
a correlation function because these samples are uncorrelated and their positional
data is not available. Another experimental study by Goldstein et al. [Gold 89]
gives apparent density and Young’s modulus for 8 mm specimens in the proximal and
distal femur. Figure 4 compares the data presented in [Cart 77] and [Gold 89]. These
two sets of data do not appear to be consistent. This can be attributed to the
following:

1.  The  specimens  in  [Cart  77]  came  from  both  human  as  well  as  bovine  bone. 

2.  [Gold  89]  does  not  contain  any  data  for  cortical  bone. 

3. [Cart 77] contains both fresh and embalmed specimens from different investigators
who probably performed experiments under different test conditions.

It  was  therefore  decided  to  use  the  positional  data  of  these  specimens  to  esti¬ 
mate  the  correlation  function,  regression  coefficients  and  variance  by  the  method  of 
maximum  likelihood. 

The following relationship was assumed to exist between the Young’s modulus
($E$) and the apparent density ($\rho$):

$$\ln(E) = A + B \ln(\rho) + \epsilon \qquad (1)$$

which can be written as

$$Y = A + BX + \epsilon \qquad (2)$$

where $Y = \ln(E)$ and $X = \ln(\rho)$; $A$ and $B$ are unknown regression coefficients and
$\epsilon \sim N(0, \sigma^2)$ is a normally distributed random error. This is consistent with the
linear regression on a log-log scale performed in [Cart 77]. A total of 332 specimens
were obtained from the left and right proximal and distal femurs of two cadavers [Gold 89].

Therefore we have

$$y_i = A + B x_i + \epsilon_i, \quad i = 1 \text{ to } 332 \qquad (3)$$

The following correlation function was chosen for the random errors $\epsilon_i$:

$$\mathrm{Cov}[\epsilon_i, \epsilon_j] = \sigma^2 e^{-d/L} \qquad (4)$$

where $d$ is the Euclidean distance between the centers of specimens $i$ and $j$.

The  above  correlation  function  is  used  with  the  following  restrictions: 

•  There is no correlation between the errors $\epsilon_i$ from the proximal femur to the
distal femur.

•  There is no correlation between the errors $\epsilon_i$ from the left leg to the right leg.

•  There is no correlation between the errors $\epsilon_i$ from one person to another.

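The problem can now be stated in matrix form as follows:

$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon} \qquad (5)$$

$$E[\mathbf{Y}] = E[\mathbf{X}\boldsymbol{\beta}] + E[\boldsymbol{\epsilon}] = \mathbf{X}\boldsymbol{\beta} \qquad (8)$$

$$E[(\mathbf{Y} - \mathbf{X}\boldsymbol{\beta})(\mathbf{Y} - \mathbf{X}\boldsymbol{\beta})'] = E[\boldsymbol{\epsilon}\boldsymbol{\epsilon}'] = \sigma^2 \mathbf{V} \qquad (9)$$

where $\mathbf{V} = f(L)$ and $L$ is the “correlation length”. Maximum likelihood estimates
for the parameters $A$, $B$, $\sigma$ and $L$ were calculated [Chin 89].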
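
One way such estimates could be computed is by profiling the Gaussian likelihood: for a fixed correlation length the regression coefficients and variance have closed-form generalized least squares solutions, so only $L$ needs a numerical search. The sketch below is an assumption-laden illustration, not the procedure of [Chin 89]. The function names, the search bounds, and the synthetic data are all hypothetical, and the block structure (no correlation between different bones, legs, or persons) is omitted for brevity.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.optimize import minimize_scalar


def exp_corr(coords, L):
    """Correlation matrix V with entries exp(-d_ij / L)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return np.exp(-d / L)


def profile_fit(L, x, y, coords):
    """For fixed L return (negative profile log-likelihood, beta_hat, sigma2_hat)."""
    n = y.size
    X = np.column_stack([np.ones(n), x])        # columns for A and B
    c = cho_factor(exp_corr(coords, L), lower=True)
    Vinv_X, Vinv_y = cho_solve(c, X), cho_solve(c, y)
    beta = np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)   # GLS estimate of (A, B)
    r = y - X @ beta
    sigma2 = r @ cho_solve(c, r) / n                     # ML estimate of sigma^2
    logdet_V = 2.0 * np.log(np.diag(c[0])).sum()
    # Additive constants of the log-likelihood are dropped.
    return 0.5 * (n * np.log(sigma2) + logdet_V), beta, sigma2


def fit_ml(x, y, coords, bounds=(0.5, 100.0)):
    """Search over L (hypothetical bounds, in mm), then recover A, B, sigma."""
    res = minimize_scalar(lambda L: profile_fit(L, x, y, coords)[0],
                          bounds=bounds, method="bounded")
    _, beta, sigma2 = profile_fit(res.x, x, y, coords)
    return res.x, beta, np.sqrt(sigma2)


# Synthetic data standing in for specimen positions and log-densities.
rng = np.random.default_rng(1)
n = 150
coords = rng.uniform(0.0, 100.0, size=(n, 3))
x = rng.uniform(-1.0, 1.0, size=n)
chol = np.linalg.cholesky(exp_corr(coords, 15.0))
y = 6.8 + 1.5 * x + 0.3 * (chol @ rng.standard_normal(n))

L_hat, beta_hat, sigma_hat = fit_ml(x, y, coords)
print(L_hat, beta_hat, sigma_hat)
```

Imposing the three restrictions listed above would amount to zeroing the entries of the correlation matrix that link specimens from different bones, legs, or persons before factorizing it.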
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4  Results 


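The maximum likelihood estimates obtained are given below:

$$\hat L = 16.32 \text{ mm}, \quad \hat A = 6.82, \quad \hat B = 1.4676, \quad \hat\sigma = 0.313 \qquad (10)$$

The relationship between $E$ and $\rho$ (shown in Figure 4) can now be written as

$$E = 916\, \rho^{1.4676}\, e^{\epsilon} \qquad (11)$$

where $\epsilon \sim N(0, \hat\sigma^2)$.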
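
The arithmetic linking the log-scale estimates to the power law can be checked with a few lines. This is only a sketch: the function names are hypothetical, the density passed in is an arbitrary example value, and the units of $E$ and $\rho$ are those of the underlying data, which are not restated here.

```python
import math

A_HAT, B_HAT, SIGMA_HAT = 6.82, 1.4676, 0.313  # estimates quoted in the text


def median_modulus(rho):
    """Median of E for apparent density rho: exp(A) * rho**B."""
    return math.exp(A_HAT) * rho ** B_HAT


def mean_modulus(rho):
    """Mean of E, using the lognormal mean factor exp(sigma^2 / 2)."""
    return median_modulus(rho) * math.exp(0.5 * SIGMA_HAT ** 2)


# Coefficient of variation implied by a lognormal error with this sigma.
cov_modulus = math.sqrt(math.exp(SIGMA_HAT ** 2) - 1.0)

print(median_modulus(1.0), mean_modulus(1.0), cov_modulus)
```

The leading constant 916 is simply $e^{6.82}$, and because the error enters multiplicatively the coefficient of variation of $E$ does not depend on the density.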

Figure 5 shows the distribution of Young’s modulus in the proximal femur with
an implant. Titanium was chosen as the implant material and its Young’s modulus
(= 110 GPa) is deterministic.

This problem was solved using both FOSM and simulation. For any function $f(\mathbf{x})$
(such as displacement or stress) of the random variables $\mathbf{x}$ (here, Young’s moduli),
a Taylor series expansion can be performed about the mean values of the random
variables:

$$f(\mathbf{x}) \approx f(\bar{\mathbf{x}}) + \left(\frac{\partial f}{\partial \mathbf{x}}\right)^T (\mathbf{x} - \bar{\mathbf{x}}) \qquad (12)$$

This yields

$$E[f(\mathbf{x})] \approx f(\bar{\mathbf{x}}) \qquad (13)$$

$$\mathrm{Var}[f(\mathbf{x})] \approx \left(\frac{\partial f}{\partial \mathbf{x}}\right)^T C_{xx} \left(\frac{\partial f}{\partial \mathbf{x}}\right) \qquad (14)$$

where $C_{xx}$ is the covariance matrix of the input variables, $\bar{\mathbf{x}}$ is the mean vector,
and the derivatives are evaluated at $\bar{\mathbf{x}}$. The mean response is thus the usual
deterministic response. This analysis ignores the distribution function of $\mathbf{x}$ and the
non-linearity of $f(\mathbf{x})$. It is however computationally much faster than simulation.
Simulation and FOSM results on plane-strain analyses of the proximal femur indicate
that FOSM is sufficiently accurate in predicting the first two moments of the response.
The error in mean and standard deviations of stresses was usually well under 5 percent.
Figure 6 compares graphically the standard deviations of the stress in the
inferior-superior direction obtained by these two methods. Moreover, the marginal
distribution of stress at any point was very close to a Gaussian distribution. This
suggests that in spite of the approximations made in FOSM analysis, FOSM can be used
as a reliable alternative to simulation.
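
The comparison between FOSM and simulation can be sketched on a toy problem. The example below is not the femur model: it assumes a hypothetical response (the displacement of two springs in series under a fixed load), Gaussian stiffnesses with an arbitrary mean and covariance, and a finite-difference gradient in place of analytical sensitivities.

```python
import numpy as np


def fosm(f, mean, cov, rel_step=1e-6):
    """First-order mean and variance of f(x) expanded about the mean vector."""
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    grad = np.empty_like(mean)
    for i in range(mean.size):
        h = rel_step * max(abs(mean[i]), 1.0)
        step = np.zeros_like(mean)
        step[i] = h
        # Central finite difference stands in for an analytical sensitivity.
        grad[i] = (f(mean + step) - f(mean - step)) / (2.0 * h)
    return float(f(mean)), float(grad @ cov @ grad)


def tip_displacement(k, load=1000.0):
    """Toy response: two springs in series under a fixed load (hypothetical)."""
    return load * np.sum(1.0 / np.asarray(k, dtype=float), axis=-1)


mean_k = np.array([100.0, 100.0])                 # mean stiffnesses
cov_k = np.array([[100.0, 50.0], [50.0, 100.0]])  # 10% c.o.v., correlation 0.5

fosm_mean, fosm_var = fosm(tip_displacement, mean_k, cov_k)

# Monte Carlo reference using the same first two moments of the input.
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mean_k, cov_k, size=200_000)
mc = tip_displacement(samples)

print("FOSM:", fosm_mean, np.sqrt(fosm_var))
print("MC:  ", mc.mean(), mc.std(ddof=1))
```

FOSM needs one gradient evaluation, whereas the simulation needs many response evaluations; for mildly non-linear responses and moderate input variability the two sets of moments stay close.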

The coefficient of variation can be defined as:

$$\text{coefficient of variation} = \frac{\text{standard deviation}}{\text{mean}} \qquad (15)$$

The coefficient of variation of stress as a result of randomness in Young’s modulus
was about a third of that of Young’s modulus. This suggests that stresses are not as
random as the Young’s moduli.

Randomness in loading is easier to analyze because stresses and displacements are
linear  functions  of  the  magnitudes  of  the  applied  loads.  Thus  FOSM  analysis  can 
accurately  calculate  the  first  two  moments  of  the  response.  Moreover,  the  coefficient 
of  variation  of  stresses  (or  displacements)  is  the  same  as  the  coefficient  of  variation  of 
the  applied  loads,  provided  the  applied  loads  are  fully  correlated.  Since  the  applied 
load  is  not  correlated  to  the  Young’s  modulus,  the  resulting  randomness  in  stress 
is  dominated  by  the  randomness  in  loading.  Moreover  if  the  loads  are  Gaussian, 
the  resulting  stresses  and  displacements  will  also  be  Gaussian  and  FOSM  will  again 
produce  accurate  results. 
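
A short first-order calculation shows why loading dominates. It is a sketch under assumptions that the paper does not state explicitly: the stress is treated as the product of a load factor and a modulus-dependent factor, the two sources are independent, and the coefficients of variation $c_P$ (loads) and $c_E$ (Young’s modulus) are illustrative values rather than results of the study. The factor of one third is the ratio reported above for the modulus contribution.

```latex
% c_\sigma: coefficient of variation of stress (symbols introduced for illustration)
c_\sigma^2 \;\approx\; c_P^2 + \left(\tfrac{1}{3}\, c_E\right)^2
% Illustrative values c_P = c_E = 0.3:
c_\sigma \;\approx\; \sqrt{0.3^2 + 0.1^2} \;=\; \sqrt{0.10} \;\approx\; 0.316
```

With these values the loading contribution alone (0.30) is about 95 percent of the combined coefficient of variation, so neglecting the randomness in Young’s modulus changes the result only slightly.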

5  Conclusion 

This  paper  studies  the  effect  of  uncertainties  in  material  properties  and  loading  on 
stresses  and  displacements  in  the  proximal  femur.  Simulation  studies  showed  that  the 
approximate  method  of  First  Order  Second  Moment  analysis  can  predict  accurately 
the  first  two  moments  of  the  response.  The  resulting  marginal  distribution  of  stress 
was  very  close  to  being  Gaussian.  When  the  applied  loads  are  deterministic  and 
the  Young’s  moduli  are  random,  the  coefficient  of  variation  of  stresses  was  found  to 
be  much  less  than  that  of  Young’s  modulus.  Since  stresses  are  linear  functions  of 
the  applied  loads  the  coefficient  of  variation  of  stresses  is  equal  to  the  coefficient  of 
variation  of  the  applied  loads  when  the  Young’s  moduli  are  deterministic.  When  both 
Young’s  moduli  and  applied  loads  are  random,  the  randomness  in  loads  dominates 
randomness  in  Young’s  modulus.  Hence  the  resulting  response  can  be  predicted 
accurately  by  modeling  randomness  in  loading  alone. 
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Abstract 

This  paper  summarizes  the  results  presented  at  the  Army  Research  Workshop  held 
at  Monterey,  CA  in  October,  1989.  A  more  detailed  version  will  appear  elsewhere. 

In the age-dependent minimal repair model of Block, Borges, and Savits (1985),
a system failing at age t undergoes one of two types of repair. With probability
p(t), a perfect repair is performed, and the system is returned to the ‘good-as-new’
state, while with probability 1 - p(t), a minimal repair is performed, and the system
is repaired, but is only as good as a working system of age t. Whitaker and
Samaniego (1989) propose an estimator for the system life distribution F when data
are collected under this model.

Using  the  product  integral  representation  of  the  survival  function,  a  basic  result 
of  Block,  Borges,  and  Savits  concerning  the  waiting  time  until  the  first  perfect  repair 
is  extended  to  allow  for  discontinuous  distributions.  Then  using  counting  process 
techniques,  the  large  sample  theorems  of  Whitaker  and  Samaniego  are  extended  to 
the  whole  line.  These  results  are  used  to  derive  confidence  bands  for  F ,  and  to 
determine  a  sufficient  condition  for  their  applicability  on  the  whole  line.  Simulation 
results  for  the  bands  are  provided.  An  extension  of  the  Wilcoxon  two-sample  test  to 
the  minimal  repair  model  is  also  examined. 


1  The  Minimal  Repair  Model 

To fix notation, let $F$ be a life distribution, let $\tau_F$ be the upper endpoint of the support
of $F$ (possibly infinite), and let $\Lambda(t) = \int_{(0,t]} (\bar F(s-))^{-1}\, dF(s)$ be the cumulative hazard
function of $F$, where $\bar F = 1 - F$.

Now, for $j = 1, \ldots, n$, let $\{X_{j,0} \equiv 0, X_{j,1}, X_{j,2}, \ldots\}$ be independent record value
processes from $F$. These are Markov processes with

$$P(X_{j,k} > t \mid X_{j,0}, \ldots, X_{j,k-1}) = \bar F(t) / \bar F(X_{j,k-1}), \quad \text{for } t > X_{j,k-1},\ k \ge 1.$$

If $\Delta F(\tau_F) > 0$, define $X_{j,l} = \infty$ for all $l$ larger than the first $k$ for which
$X_{j,k} = \tau_F$. In all cases we take $p(\tau_F) = 1$. These processes represent
the failure ages of $n$ systems under a “forever minimal repair” scheme.




Perfect repair is introduced into this model by the use of independent uniform random
variables. This facilitates the construction of the $\sigma$-field structure (filtrations) necessary
to our analysis of the model through martingale methods. Thus we let
$\{U_{j,k} : 1 \le j \le n,\ k \ge 1\}$ be i.i.d. uniform r.v.’s, and define

$$\nu_j = \inf\{k : U_{j,k} < p(X_{j,k})\}.$$

Thus observing $\{X_{j,k} : 1 \le k \le \nu_j,\ j = 1, \ldots, n\}$ is equivalent to observing $n$ independent
copies of the age-dependent minimal repair process of Block, Borges, and Savits
(BBS) (1985), each until the time of its first perfect repair.
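
The construction can be simulated directly when $F$ is continuous, because successive record values then have cumulative hazards that increase by independent unit exponential amounts. The sketch below rests on that assumption. The function and argument names are hypothetical, and the example uses a unit exponential $F$ with constant $p = 0.5$ purely for illustration.

```python
import numpy as np


def simulate_until_perfect_repair(n, p, inv_cum_hazard, rng, max_failures=100_000):
    """Failure ages of n systems, each observed until its first perfect repair.

    p              -- function giving the perfect-repair probability at age t
    inv_cum_hazard -- inverse of the cumulative hazard of a continuous F
    """
    histories = []
    for _ in range(n):
        cum_hazard, ages = 0.0, []
        # max_failures guards against p being (near) zero; such a history is truncated.
        for _ in range(max_failures):
            cum_hazard += rng.exponential(1.0)   # next record value on the hazard scale
            age = inv_cum_hazard(cum_hazard)
            ages.append(age)
            if rng.uniform() < p(age):           # perfect repair ends the observation
                break
        histories.append(ages)
    return histories


rng = np.random.default_rng(0)
histories = simulate_until_perfect_repair(
    n=20_000, p=lambda t: 0.5, inv_cum_hazard=lambda h: h, rng=rng)

perfect_ages = np.array([h[-1] for h in histories])
n_failures = np.array([len(h) for h in histories])
print(perfect_ages.mean(), n_failures.mean())
```

For this choice the number of failures per system is geometric with mean $1/p = 2$, and the age at the first perfect repair is exponential with rate $p$, so both sample means should be close to 2.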

This  structure  provides  us  with  a  concrete  starting  point  for  a  statistical  analysis  of  the 
BBS model. However, we need conditions which are sufficient to assure the finiteness of
the $\nu_j$. Such conditions are given by the following result, which generalizes a result of BBS to
the  case  of  possibly  discontinuous  F.  Though  this  generalization  may  not  be  important  for 
modeling  system  failures,  it  will  be  useful  to  us  in  proving  large  sample  results.  Also,  the 
proof  of  this  result,  which  we  sketch  below,  is  more  straightforward  than  the  original  proof 
of  BBS.  The  reader  is  referred  to  Hollander,  Proschan,  and  Sethuraman  (1989)  (HPS),  for 
detailed  proofs  of  this  and  other  results  in  this  paper. 

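Proposition 1 Let $H(t) = P(X_\nu \le t,\ \nu < \infty)$. Then

$$\bar H(t) = \prod_{(0,t]} (1 - p\, d\Lambda) = \exp\left(-\int_0^t p\, d\Lambda^c\right) \prod_{s \le t} (1 - p(s)\, \Delta\Lambda(s)),$$

where $\Lambda^c$ is the continuous part of $\Lambda$. Moreover, if either
(i) $\Delta F(\tau_F) > 0$ (and $p(\tau_F) = 1$),
or
(ii) $F(\tau_F-) = 1$ and $\int_0^{\tau_F} p\, d\Lambda = +\infty$,
then $H$ is a proper distribution function and $\nu$ is almost surely finite. Conversely, if $H$ is
a proper distribution function, then either (i) or (ii) must hold.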

Proof. (Sketch) Note that

$$\bar H(t) = 1 - P(X_\nu \le t,\ \nu < \infty) = 1 - \sum_{k=1}^{\infty} P(X_k \le t,\ \nu = k).$$

A conditioning argument shows that

$$\frac{\bar H(t)}{\bar F(t)} = 1 + \sum_{j=1}^{\infty} \int \cdots \int_{0 < t_1 < \cdots < t_j \le t} d\alpha(t_1) \cdots d\alpha(t_j),$$

where

$$\alpha(t) = \int_{(0,t]} \frac{1 - p(s)}{\bar F(s)}\, dF(s).$$

This is equivalent to

$$\frac{\bar H(t)}{\bar F(t)} = \prod_{(0,t]} (1 + d\alpha) = \exp(\alpha^c(t)) \prod_{s \le t} (1 + \Delta\alpha(s)),$$

where $\alpha^c$ is the continuous part of $\alpha$ and $\Delta\alpha(t)$ is the jump in $\alpha$ at $t$. Here, $\prod_{(0,t]} (1 + d\alpha)$
represents a product integral. The theory of product integration with applications in
statistics is reviewed in Gill and Johansen (1987). The result follows from the last equation
after some algebra. □

We will say that a pair $(F, p)$ satisfying either (i) or (ii) describes a regular repair scheme.


2  The Whitaker-Samaniego Estimator

In this section, we derive a martingale representation for the Whitaker-Samaniego (1989)
estimator (WSE). This representation is then used in conjunction with Rebolledo’s Martingale
Central Limit Theorem and the techniques of Gill (1983) to derive limit theorems
for the WSE.


The  Basic  Martingale 

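Define

and

$$\mathcal{F}_t = \sigma(\{N_j(s) : s \le t,\ 1 \le j \le n\}) \vee \sigma(\{U_{j,k} : k \ge 1,\ 1 \le j \le n\}).$$

For the rest of this paper, $(\mathcal{F}_t)_{t \ge 0}$ will serve as the underlying filtration for all martingales.
Now let

$$N(t) = \#\{(j,k) : X_{j,k} \le t,\ k \le \nu_j,\ 1 \le j \le n\},$$

and

$$M(t) = N(t) - \int_{(0,t]} Y(s)\, d\Lambda(s).$$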


In HPS, it is shown that $M$ is a locally square-integrable martingale with predictable
quadratic variation given by

$$\langle M \rangle(t) = \int_{(0,t]} Y(s)\,(1 - \Delta\Lambda(s))\, d\Lambda(s). \qquad (1)$$


This provides the basic martingale structure for further analysis of the minimal repair
model.

A Martingale Representation for the WSE

Assume that $F$ is continuous and that the pair $(F, p)$ describes a regular repair scheme.
Let $X_{(k)}$ be the ordered values of the set $\{X_{j,k} : k \le \nu_j,\ 1 \le j \le n\}$, let

$$T = \min\{X_{(k)} : Y(X_{(k)}) = 1\},$$

and let $J(s) = I(s \le T)$. Then the Whitaker-Samaniego estimator (WSE) can be written
as

$$\hat{\bar F}(t) = 1 - \hat F(t) = \prod_{(0,t]} (1 - d\hat\Lambda) = \prod_{s \le t} (1 - \Delta\hat\Lambda(s)),$$

where

$$\hat\Lambda(t) = \int_{(0,t]} \frac{J(s)}{Y(s)}\, dN(s).$$

Using Duhamel’s equation (Gill and Johansen, 1989), $(\hat F - F)/\bar F$ can be expressed
as an integral with respect to the martingale $M$:


From this and (1) it follows that $(\hat F - F)/\bar F$ is itself a locally square-integrable martingale with
predictable  quadratic  variation  process  given  by 


a  dF(s) 

mmr 


This  quadratic  variation  process  essentially  serves  to  identify  the  covariance  structure  of 
the  limiting  Gaussian  processes  derived  in  the  next  section. 


Large  Sample  Results 

With the above representation, Rebolledo’s martingale CLT and the methods of Gill (1983)
yield the following result, which extends Theorem 3.3 of Whitaker and Samaniego (1989)
to the whole line.

Theorem 1 Let $(F, p)$ describe a regular repair scheme, with $F$ continuous. Then the
following hold:

(i) As $n \to \infty$,

$$\sqrt{n}\,(\hat F - F) \xrightarrow{d} \bar F \cdot B(C) \quad \text{in } D[0, \infty],$$

where $B$ is Brownian motion on $[0, \infty)$, and

$$C(t) = \int_0^t \frac{dF(s)}{\bar F(s)\, \bar H(s)}.$$

(ii) As $n \to \infty$,

$$\sqrt{n}\, \frac{\bar K}{\bar F}\,(\hat F - F) \xrightarrow{d} B^0(K) \quad \text{in } D[0, \infty],$$

where $B^0$ is Brownian bridge on $[0, 1]$, $K = C/(1 + C)$, and $\bar K = 1 - K$.

Details of the proof of this theorem are given in HPS. We note here that the proof of
(i) does not require any additional conditions beyond regularity of the repair scheme. This
is in contrast with the analogous result of Gill (1983) for the Kaplan-Meier estimator in
the usual censored survival data model, where some condition on the amount of censoring
is needed. We will see below, however, that an additional condition limiting the amount of
imperfect repair is needed to assure convergence of the expression in (ii) when an estimate
is substituted for $\bar K / \bar F$.

3  Applications 

In  this  section,  the  asymptotic  results  of  the  last  section  are  used  to  derive  large  sample 
confidence  bands  for  F  and  to  obtain  the  limiting  distribution  of  an  extension  of  the 
Mann-Whitney-Wilcoxon test statistic to the minimal repair model.

Confidence  Bands 

The result in part (ii) of Theorem 1 suggests confidence bands based on the distribution
of the supremum of Brownian bridge. It is necessary however to estimate $\bar K/\bar F$ in order
to construct the bands. Let $\hat H$ be the empirical cdf of the $X_{j,\nu_j}$ and let $\hat K = \hat C/(1 + \hat C)$,
where $\hat C$ is defined by


We would like to have

$$\sqrt{n}\, \frac{\hat{\bar K}}{\hat{\bar F}}\,(\hat F - F) \xrightarrow{d} B^0(K) \quad \text{in } D[0, \infty], \text{ as } n \to \infty, \qquad (2)$$

in order to justify asymptotic $(1 - \alpha) \times 100\%$ confidence bands for $F$ of the form

$$\hat F \pm n^{-1/2}\, \lambda_\alpha\, \hat{\bar F} / \hat{\bar K},$$

where $\lambda_\alpha$ is the upper $\alpha$th quantile of the distribution of $\sup |B^0(t)|$.

We can show that (2) holds on $(0, \tau]$ for any $\tau < \tau_F$, but for the complete result, some
additional condition seems to be needed. Using the result of Prop. 1, it is shown in HPS
that $\bar K/\bar F$ and $\hat{\bar K}/\hat{\bar F}$ are nondecreasing and that

,  *  R B.  ,  „  t  .  S. 

1S7  -77  “d 
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Using  this,  it  can  be  shown  that  a  sufficient  condition  for  (2)  is  that 

This condition requires that $p(t) \to 1$ as $t \uparrow \tau_F$ (at a rate sufficient for the convergence of
the integral), and hence provides a limit on the amount of imperfect repair.

Simulation results for the bands computed over finite intervals (in the case of constant
$p$) indicate that coverage probabilities are quite good for sample sizes of 50 or more. This
will of course vary with the parameters of the model. Simulations were carried out with
both Gamma and Weibull distributions with varying shape parameters, and with various values
of $p$, various interval lengths, and various nominal confidence levels. As an example, the
following table gives the simulated coverage probabilities for nominal 95% confidence bands
over the interval [0, 4.744] when the underlying $F$ is Gamma with shape parameter 2. (Note
that 4.744 is the ninety-fifth percentile of Gamma(2).) More extensive tables are provided
in HPS.

    n     p = .50    p = [illegible]    p = .10
   10      .9025         .8660           .6710
   20      .9270         .9125           .9187
   30      .9460         .9287           .9327
   50      .9515         .9398           .9395
  100      .9528         .9540           .9452
  200      .9515         .9517           .9495
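
The upper endpoint of the interval used for the table can be checked numerically. The sketch assumes that Gamma(2) denotes a Gamma distribution with shape 2 and unit scale, and it uses SciPy’s quantile function.

```python
from scipy.stats import gamma

# 95th percentile of a Gamma distribution with shape 2 and unit scale (assumed).
upper_endpoint = gamma.ppf(0.95, a=2)
print(round(upper_endpoint, 3))
```

The same value is half the 95th percentile of a chi-square distribution with 4 degrees of freedom, which gives a quick cross-check against standard tables.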

An  Extension  of  the  Mann-Whitney-Wilcoxon  Test 

Using part (i) of Theorem 1, it is also possible to obtain the limiting distribution for
an adaptation of the Mann-Whitney-Wilcoxon two-sample statistic to the minimal repair
model. Here we assume that for $i = 1, 2$, we observe $n_i$ BBS processes from $(F_i, p_i)$, each
until its first perfect repair. In general we wish to test the null hypothesis $H_0 : F_1 = F_2$,
with typical one-sided alternatives specifying $\int F_1\, dF_2 > 1/2$, and two-sided alternatives
specifying $\int F_1\, dF_2 \ne 1/2$.

A statistic analogous to the Mann-Whitney form of the Wilcoxon two-sample test
statistic is $W$, as given by

$$W = \int_0^\infty \hat F_1(s)\, d\hat F_2(s),$$

where $\hat F_i$ is the WSE, $\Delta N_i(s)$ is the number of failures at age $s$, and $Y_i(s)$ is the number
of items at risk at age $s$ in the $i$th sample. This statistic is a natural estimator
of $\int F_1\, dF_2 = P(X_1 < X_2)$, where $X_1$ and $X_2$ are independent random variables, with
$X_i \sim F_i$. Assuming continuous distributions, $P(X_1 < X_2) = 1/2$ under $H_0$, and in the
one-sided case, significantly large values of $W$ provide evidence against $H_0$ in the direction
of $\int F_1\, dF_2 > 1/2$. For large sample sizes, we have the following result, which is proven in
HPS:


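Theorem 2 If $F_1$ and $F_2$ are continuous, and the pairs $(F_1, p_1)$ and $(F_2, p_2)$ describe regular
repair schemes, and if $n_1, n_2 \to \infty$ in such a way that $n_1/(n_1 + n_2) \to \lambda$, $0 < \lambda < 1$,
then

$$\sqrt{n_1 + n_2}\, \left[ W - \int F_1\, dF_2 \right] \xrightarrow{d} N\left(0,\ \frac{\sigma_1^2}{\lambda} + \frac{\sigma_2^2}{1 - \lambda}\right), \qquad (3)$$

where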

A  -  2jf  J'" 

A  -  Jjf Jf 

Under the null hypothesis, $H_0 : F_1 = F = F_2$,

»?  -  S  jf  f(<)C.(i)  Of  f  W «"(»))  dp(t)  -  J  jf  <U"(.) . 

For purposes of testing the null hypothesis in the large sample case, we thus propose
referring the test statistic

*"(,r-|)/(S+2)* 

to  a  standard  normal  distribution,  where 

tf-1  r  j*(,)  dpi,)-!1  t.  nl? (>)$(-) 

‘  *J°  I(7-j  *  w*) 


and $\hat H_i$ is the empirical distribution of the perfect repair ages in the $i$th sample.

It is shown in HPS that the $\hat\sigma_i^2$ are consistent, which justifies the use of this test. If
the $p_i$ are constants (see Brown-Proschan (1983)), the above expressions simplify greatly
under $H_0$. If $F_1 = F_2 = F$, then

$$\sigma_i^2 = \frac{1}{4(4 - p_i)},$$

and the asymptotic variance in (3) reduces to

$$\frac{\sigma_1^2}{\lambda} + \frac{\sigma_2^2}{1 - \lambda} = \frac{1}{\lambda\,(4(4 - p_1))} + \frac{1}{(1 - \lambda)\,(4(4 - p_2))}.$$

The $p_i$’s are of course consistently estimated by their MLE’s, $\hat p_i$, the ratio of $n_i$ to the total
number of failures in the $i$th sample, and for large samples, the statistic $Z'$ given by

$$Z' = \left(W - \frac{1}{2}\right) \Big/ \left[\frac{1}{4 n_1 (4 - \hat p_1)} + \frac{1}{4 n_2 (4 - \hat p_2)}\right]^{1/2}$$

can be referred to a standard normal distribution in order to test the null hypothesis. Note
also that if $p_1 = p_2 = 1$, then we are in the usual i.i.d. two-sample model, the WSE’s
reduce to the empirical c.d.f.’s, and $W$ is just a multiple of the Mann-Whitney form of the
Wilcoxon rank-sum statistic. In this case, the above results yield

$$\sqrt{n_1 + n_2}\, \left(W - \frac{1}{2}\right) \xrightarrow{d} N\left(0,\ \frac{1}{12\,\lambda(1 - \lambda)}\right),$$

in agreement with the usual results for the Mann-Whitney-Wilcoxon test.
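
For constant repair probabilities the test reduces to simple arithmetic once $W$ is available. The sketch below assumes that each sample contributes a variance term of the form $1/(4 n_i (4 - \hat p_i))$, which reduces to the classical $1/(12 n_i)$ when $\hat p_i = 1$. The function name and the input values are hypothetical, and the computation of $W$ itself from the two WSEs is not shown.

```python
import math


def z_prime(w, n1, n2, failures1, failures2):
    """Large-sample statistic for H0: F1 = F2 with constant p_i (assumed form).

    n_i        -- number of systems (one perfect repair each) in sample i
    failures_i -- total number of failures observed in sample i
    """
    p1_hat = n1 / failures1          # MLE of p_1
    p2_hat = n2 / failures2          # MLE of p_2
    var_w = 1.0 / (4.0 * n1 * (4.0 - p1_hat)) + 1.0 / (4.0 * n2 * (4.0 - p2_hat))
    return (w - 0.5) / math.sqrt(var_w)


# Hypothetical inputs: 50 systems per sample, 100 failures in each sample.
z = z_prime(w=0.6, n1=50, n2=50, failures1=100, failures2=100)
print(z)
```

The value is then compared with standard normal quantiles, one-sided or two-sided according to the alternative of interest.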
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THE  APPLICATION  OF  A  COMPOSITE  DESIGN  TO 
TEST  A  COMBAT  SIMULATION  MODEL 
Carl  B.  Bates 

US  Army  Concepts  Analysis  Agency 
Bethesda,  Maryland  20814-2797 

ABSTRACT.  A study is to be performed that involves the determination of a
mix of target acquisition systems that yields an improved capability at a
lesser cost.  A primary candidate for the combat simulation is a two-sided
deterministic division-level ground combat model.  Before the model could be
used in the study, the model had to be tested to determine its capability to
evaluate the combat effectiveness of mixes of target acquisition systems.

The  test  involved  four  factors,  one  qualitative  and  three  quantitative 
factors.  Time  constraints  limited  the  number  of  simulations  to  30  runs.  A 
composite  design  is  presented.  Its  application  is  illustrated,  and  its 
efficiency  is  discussed. 

1.  INTRODUCTION.  The  test  was  to  assess  the  sensitivity  of  model 
output  to  specified  changes  in  input  values  for  the  four  selected  input 
factors.  The  four  factors  are: 

TYP  -  Type  of  sensor, 

FRC  -  The  fraction  of  target  elements  for  which  the  sensor  has  both 
coverage  and  line-of-sight, 

TIM  -  The  time,  in  minutes,  that  a  sensor  spends  processing  and 
reporting  a  target  it  has  detected, 

NUM  -  The  total  number  of  sensors  employed  in  a  model  run.


Two  types  (A  and  B)  of  sensors  were  to  be  evaluated.  Three  values  were 
ultimately  selected  for  each  of  the  three  quantitative  factors.  The  minimum 
and  maximum  from  operational  performance  were  taken  as  the  lower  and  upper 
values.  A "middle" value was then added.  The values are:

FRC - 0.1, 0.5, 0.9,

TIM - 0, 5, 10,

NUM - 5, 15, 25.


This  gave  a  2x3x3x3  full  design.  Time  constraints,  however,  would  permit 
only  30  runs  for  the  complete  test. 

2.  EXPERIMENTAL DESIGN.  Therefore, the objective is to develop an
experimental design with not more than 30 design points.  The design should
permit assessment of a full second-order model in the three quantitative
factors.  Draper and John (1988) discuss response surface designs for
quantitative and qualitative variables.  They give some first- and second-order
designs for $2^k$ factorials and $2^{k-p}$ fractional factorials.  The decision
was made, however, that a single model involving TYP had no advantage over
two models, one for each of the two types of sensors.  Now the problem is to
develop a response surface design (one for each sensor type) for the three
three-level quantitative factors.

Let the three variables $X_1$, $X_2$, and $X_3$ represent the three quantitative
factors.  The second-order model we wish to investigate is:

$$y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_{11} X_1^2 + \beta_{22} X_2^2 + \beta_{33} X_3^2 + \beta_{12} X_1 X_2 + \beta_{13} X_1 X_3 + \beta_{23} X_2 X_3 + \epsilon$$

The 27 design points of the full 3x3x3 design are shown in Figure 1.  The
low, middle, and high values of the three variables are denoted by "0", "1",
and "2", respectively.  The eight corner points, (000), (200), (020), (220),
(002), (202), (022), and (222), would be a full $2^3$ design if there were no
middle values.  If these eight points are augmented with the center point
(111) and the six center points of each plane, (211), (011), (101), (121),
(110), and (112), we have a design similar to a central composite design.

The design is given in Table 1 and illustrated in Figure 2.  Box and Wilson
(1951) introduced the concepts of composite designs.  Myers (1971) and Box
and Draper (1987) discuss second-order composite designs.  Myers, Khuri, and
Carter (1989) discuss recent and current response surface methodology
research.
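
The 15-point design described above can be generated and checked programmatically. The sketch below is illustrative only: it codes the levels 0, 1, 2 as -1, 0, +1 (a choice made here for convenience), builds the ten-column model matrix of the second-order model, and confirms that all ten coefficients are estimable. The variable and function names are hypothetical, and the run order differs from Table 1.

```python
from itertools import product

import numpy as np

# Levels are coded 0 (low), 1 (middle), 2 (high), as in the text.
corners = list(product((0, 2), repeat=3))                       # eight corner points
star = [(2, 1, 1), (0, 1, 1), (1, 0, 1), (1, 2, 1), (1, 1, 0), (1, 1, 2)]
center = [(1, 1, 1)]
design = np.array(corners + star + center, dtype=float)


def second_order_matrix(levels):
    """Model matrix for the full second-order model in three variables."""
    x1, x2, x3 = (levels - 1.0).T        # recode 0, 1, 2 as -1, 0, +1
    return np.column_stack([
        np.ones(len(levels)), x1, x2, x3,
        x1 ** 2, x2 ** 2, x3 ** 2,
        x1 * x2, x1 * x3, x2 * x3,
    ])


M = second_order_matrix(design)
rank = np.linalg.matrix_rank(M)
print(design.shape[0], "runs, model matrix rank", rank)
```

A rank of ten means the 15 runs per sensor type support the full second-order model, which leaves five degrees of freedom for residual variation within the 30-run budget for the two sensor types.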




Table 1.  Three-variable Composite Design

  Run #    X1    X2    X3
    1       0     0     0     Corners
    2       2     0     0
    3       0     2     0
    4       2     2     0
    5       0     0     2
    6       2     0     2
    7       0     2     2
    8       2     2     2
    9       2     1     1     Star
   10       0     1     1
   11       1     0     1
   12       1     2     1
   13       1     1     0
   14       1     1     2
   15       1     1     1     Center

A three-variable central composite design is given in Table 2.  The
literature on central composite designs discusses determining the value of $\alpha$
to yield orthogonal designs.  The value of $\alpha$ is the length of the axial
points shown in Figure 3.

Figure 2.  Composite Design

Table 2.  Three-variable Central Composite Design
[Table body not reproduced; row groups: Factorial, Axial, Center]

3.  DESIGN EFFICIENCY.  Myers (1971) discusses the efficiency of central
composite designs (ccd) and shows that a three-variable orthogonal ccd is as
efficient as a $3^3$ factorial design for estimating the mixed quadratic
coefficients.  The results, however, apply only to orthogonal ccd and do not
apply to the restrained composite design in Table 1.

Because no information could be found on the efficiency of the
restrained composite design, a cursory evaluation was made of the design.
ACED, Algorithms for the Construction of Experimental Designs, developed by
Welch (1985), was used for the evaluation. Welch (1984) generalizes
Mitchell's DETMAX algorithm and discusses ACED. ACED has four optimality
criteria: D Optimality (DO), Average Variance of the Response Estimates (AV),
Maximum Variance of the Response Estimators (MV), and Average Mean Squared
Error of the Response Estimators (AM). AM was selected as the evaluation
criterion because it provided a robust balance between variance and bias.
The AM criterion is discussed in Welch (1983).

Figure 3. Central Composite Design


The variances of the parameter estimates (b's) of the second-order model
are:

V(b_0)  = 12.0
V(b_i)  = 28.6
V(b_ii) = 5.8
V(b_ij) = 1.9

The variance efficiency is 99.6% and the bias efficiency is 91.6%.

Since these efficiencies were considered acceptable and time constraints
precluded further evaluation or design development, the composite design in
Table 1 was employed.

4. APPLICATION. The model was exercised for each sensor type for each
of the 15 design points in Table 1. Several output variables were extracted
and analyzed. Testing was performed at the 0.05 level of significance. One
data set, Red personnel losses, is shown in Table 3. The significant model
was considered to be:

$$\hat{Y} = -941.2 + 1771.1\,X_1 + 483.8\,X_3 - 9.5\,X_3^2 - 198.3\,X_1 X_3$$

The unadjusted $R^2$ was 0.90. The residuals $(y_i - \hat{Y}_i)$ ranged from -743 to 568.
The observed and the predicted values are shown in Figure 4. The confidence
intervals on $\hat{Y}$ ranged from ±481 to ±701.
Table 3. Red Personnel Losses with Sensor A

  Run #    X1     X2    X3      y
    1      0.1     0     5    1471
    2      0.9     0     5    2333
    3      0.1    10     5     919
    4      0.9    10     5    1615
    5      0.1     0    25    4313
    6      0.9     0    25    2596
    7      0.1    10    25    5159
    8      0.9    10    25    2153
    9      0.9     5    15    2670
   10      0.1     5    15    4201
   11      0.5     0    15    4038
   12      0.5    10    15    2835
   13      0.5     5     5    1823
   14      0.5     5    25    3858
   15      0.5     5    15    4146
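As a rough consistency check, the fitted model can be evaluated at the 15 design points and compared with the observed losses. The sketch below makes two assumptions that should be read as such: the printed equation is taken to have a negative intercept (-941.2), and the X1 settings for runs 9 and 11 to 13 are taken from the Table 1 design (coded level 1 as 0.5, level 2 as 0.9). The function and variable names are illustrative only.

```python
# Table 3 rows as (X1, X2, X3, y). X1 for runs 9 and 11-13 is assumed from
# the Table 1 design, not read directly from the printed table.
runs = [
    (0.1, 0, 5, 1471), (0.9, 0, 5, 2333), (0.1, 10, 5, 919),
    (0.9, 10, 5, 1615), (0.1, 0, 25, 4313), (0.9, 0, 25, 2596),
    (0.1, 10, 25, 5159), (0.9, 10, 25, 2153), (0.9, 5, 15, 2670),
    (0.1, 5, 15, 4201), (0.5, 0, 15, 4038), (0.5, 10, 15, 2835),
    (0.5, 5, 5, 1823), (0.5, 5, 25, 3858), (0.5, 5, 15, 4146),
]


def predict(x1, x3):
    """Second-order model in X1 and X3 (assumed reading of the equation)."""
    return -941.2 + 1771.1 * x1 + 483.8 * x3 - 9.5 * x3 ** 2 - 198.3 * x1 * x3


residuals = [y - predict(x1, x3) for x1, _x2, x3, y in runs]
mean_y = sum(row[3] for row in runs) / len(runs)
ss_total = sum((row[3] - mean_y) ** 2 for row in runs)
ss_resid = sum(r ** 2 for r in residuals)
r_squared = 1.0 - ss_resid / ss_total        # unadjusted R^2

print(f"residual range: {min(residuals):.0f} to {max(residuals):.0f}")
print(f"unadjusted R^2: {r_squared:.2f}")
```

Under these assumptions the residual range and unadjusted R^2 come out close to the values quoted in the text, allowing for rounding of the printed coefficients.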

The analysis results for this output variable are shown only to illustrate
application of the composite design, not to illustrate goodness of the final
model.

Figure 4. Observed and Predicted (y, Y) Values
5. SUMMARY. The 12-point Box-Behnken design, which is the complement of
the 15-point composite design used, was not considered. It may have provided
a more efficient design. Also not considered was shortening the six axial
points to give five levels for each of the variables. This, too, may have
been a design superior to the one employed. The 15-point composite design
employed was considered to be appropriate for the purpose of evaluating a
second-order model.
Abstract

Distribution  theory  is  developed  for  diagnostics  used  to  investigate 
variance  component  estimates  and  model  assumptions  in  mixed  or  random 
models.  Estimation  of  variance  components  in  a  given  model  is  the 
equivalent  of  estimation  of  certain  linear  functions  thereof.  Each 
such  linear  function  is  realized  as  an  average  of  natural  sample 
covariances,  that  may  be  independent  or  correlated.  The  distribution 
of  the  set  of  these  sample  covariances  is  developed  in  both  cases, 
thereby  giving  a  formal  basis  for  a  diagnostic  procedure  that  has  been 
used  to  identify  sources  of  negative  variance  component  estimates  and 
to  reveal  model  deficiencies.  This  mixed  or  random  analog  of  residual 
analysis,  complete  with  diagnostic  tools,  is  presented.  This  involves, 
in  part,  a  re-examination  of  the  model  for  mixed  or  random  effects. 

The  distribution  applies  to  any  random  or  mixed  model  and  is 
illustrated  here  in  actual  repeated  measures  experiments  and  validated 
by  simulations. 
1.1 Introduction

The problem of estimating variance components is equivalent to
the problem of estimating the covariance between appropriately
related observations. As alluded to in Hocking (1989), the estimate
is an average of sample covariances, individually referred to herein
as diagnostics, or is a simple linear function of such averages.
Therefore, the development of the distribution theory for the variance
component diagnostics will focus on the development of the distribution
of the sample covariances. It will be useful to consider these
as bilinear forms. For example, consider a three-factor factorial
experiment with factor 1 random and factors 2 and 3 fixed. To estimate
$\phi_1$, a sample covariance of the form

$$C = \frac{1}{a_1 - 1} \sum_i (\bar{Y}_{ijk.} - \bar{Y}_{.jk.})(\bar{Y}_{ij^*k^*.} - \bar{Y}_{.j^*k^*.})$$

is used, in which $j \ne j^*$ and $k \ne k^*$. This sample covariance can be
written as a bilinear form

$$\frac{1}{n} Z_1' A Z_2, \quad \text{with } Z_1' = (\bar{Y}_{ijk.})_i,\ Z_2' = (\bar{Y}_{ij^*k^*.})_i,\ A = I_{a_1} - JJ'/a_1, \text{ and } n = a_1 - 1.$$

Equivalently, the bilinear form can be written

$$\frac{1}{2n}\, [Z_1',\ Z_2'] \begin{bmatrix} 0 & A \\ A & 0 \end{bmatrix} \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix}. \qquad (1.1)$$

Except for rearranging indices, the bilinear form associated with any
diagnostic can be written as (1.1). For simplicity, the examples
discussed assume a three-factor model, but the methodology is general.

If a nonfactorial model is assumed, still with only one
random factor and it is not nested, then a sample covariance of the
form C is still appropriate. However, depending on the nesting, one
of the conditions $j \ne j^*$, $k \ne k^*$ might be relaxed. In the case
of four or more factors, the same results hold, so long as there is
only one random factor (other than replication) and it is not nested.

The distribution of $Z_1'AZ_2$ depends on the covariance
structure of $(Z_1', Z_2')$. There are two cases to consider. If there is
only one random factor, such as factor 1, then
$(Z_1', Z_2')' \sim N(\mu, V)$, in which $\mu' = (\mu_1', \mu_2')$, and

$$V = \begin{bmatrix} aI & cI \\ cI & aI \end{bmatrix},$$

with each of a and c being a simple linear function of the variance
components.

If factor 1 is not the only random factor, V may be more
complex and the diagnostics are based on non-independent paired observations.
This case will be discussed in Section 2.
The first explicit density function for the bilinear form,

$$C = \sum_i (\bar{Y}_{ijk.} - \bar{Y}_{.jk.})(\bar{Y}_{ij^*k^*.} - \bar{Y}_{.j^*k^*.})/n,$$

with $n = a_1 - 1$, was developed by Pearson, Jeffery, and Elderton in 1929
based on independent sample pairs $(\bar{Y}_{ijk.}, \bar{Y}_{ij^*k^*.})$ having a
bivariate normal distribution with the variance-covariance structure
of $V^*$ below. In summary, they used the result that if $\bar{Y}_{ijk.}$ and
$\bar{Y}_{ij^*k^*.}$ are jointly normally distributed random variables, with
expected values $\mu_1$ and $\mu_2$, respectively, and covariance matrix

$$V^* = \begin{bmatrix} a & c \\ c & a \end{bmatrix}, \quad \rho = c/a,$$

then the conditional distribution of $\bar{Y}_{ijk.}$, given $\bar{Y}_{ij^*k^*.}$, is normal with
expected value $\mu_1 + \rho(\bar{Y}_{ij^*k^*.} - \mu_2)$ and standard deviation $[a(1-\rho^2)]^{1/2}$.

Thus, the conditional distribution of $nC$, given the $a_1$-vector $(\bar{Y}_{ij^*k^*.})$,
is normal with expected value $\rho S$ and standard deviation $[a(1-\rho^2)S]^{1/2}$,
where

$$S = \sum_i (\bar{Y}_{ij^*k^*.} - \bar{Y}_{.j^*k^*.})^2.$$

As S is distributed as $a\,\chi^2_n$, the probability density function of C is

$$f(c) = n \int_0^\infty \frac{S^{(n-3)/2} \exp(-S/(2a))}{(2\pi)^{1/2}\, a^{1/2} (1-\rho^2)^{1/2}\, (2a)^{n/2}\, \Gamma(n/2)} \exp\!\left[ -\frac{(nc - \rho S)^2}{2a(1-\rho^2)S} \right] dS. \qquad (1.2)$$
Various other methods of deriving the distribution have been
demonstrated by Wishart and Bartlett (1932), Hirschfeld (1937), and
Mahalanobis, Bose, and Roy (1937).

Press (1967) presented some other equivalent forms of the
density (1.3). Defining a sample of N independent observations
$(Z_{11}, Z_{21}), \ldots, (Z_{1N}, Z_{2N})$ from $N_2(\mu, V^*)$, he found that for $n = N-1$,

$$f(c) = \frac{n\,(\beta^2 - \gamma^2)^{n/2}\, |nc|^{(n-1)/2}\, e^{\gamma n c}}{\pi^{1/2}\, (2\beta)^{(n-1)/2}\, \Gamma(n/2)}\; K_{(n-1)/2}(\beta |nc|), \qquad (1.3)$$

in which $K_\nu(z)$ denotes a modified Bessel function, $\beta > |\gamma|$,
where $\beta$ and $\gamma$ are functions of $\rho$ and the common variance (a) of
the Z's, and are equal to $\beta = [a(1-\rho^2)]^{-1}$ and
$\gamma = \rho\beta$, with $\rho = c/a$, in which c is the covariance of the Z's and a
the variance. When $\nu$ is an integer, the Bessel function is referred
to as a modified Bessel function, and when $\nu$ is an odd half-integer, it
is referred to as a modified spherical Bessel function of fractional order
or a Bessel function of the third kind. When the number of degrees of
freedom n is even, it is possible to express the density of C in terms
of elementary functions and to calculate the exact expression since

$$K_{(n-1)/2}(z) = \left(\frac{\pi}{2z}\right)^{1/2} e^{-z} \sum_{j=0}^{(n-2)/2} \frac{\Gamma(n/2 + j)}{\Gamma(j+1)\,\Gamma(n/2 - j)}\, (2z)^{-j}.$$

Press (1967) provided formulae for computing the exact
cumulative distribution function of the sample covariance for an even
number of degrees of freedom. In addition, percentage points of the
C distribution for seven values of n and $\rho = 0$ were tabulated.

However, for an arbitrary sample size, Press states that the
probability density function of C "is a complicated expression which
is difficult to evaluate." To evaluate the probability density
function, it was necessary to develop an efficient formula for
calculating the distribution function of the covariance utilizing the
recursive properties of the Bessel function.

1.3 Distribution of the Sample Covariance for all Sample Sizes

In developing the computational formula of the distribution,
two cases had to be considered. For the first case, N is even, and
(N-2)/2 is an integer. The second case is that N is odd. Thus, the
calculation of the probability density function requires calculation
of the modified Bessel function for both integer and fractional order.

The computation of the modified Bessel function of integer
order requires two polynomial approximations for order 0 and 1, which
will be referred to in this paper as $K_0(y)$ and $K_1(y)$, respectively. These
approximations are precise to at least $1 \times 10^{-8}$. The approximations are
defined in Abramowitz and Stegun (1964). From $K_0(y)$ and $K_1(y)$ and
results in Abramowitz and Stegun (1964), the following recursive
formula may be developed:

$$K_{n+1}(y) = (2n/y)\,K_n(y) + K_{n-1}(y). \qquad (1.4)$$

For example, $K_2(y) = (2/y)\,K_1(y) + K_0(y)$, and
$K_3(y) = (4/y)\,K_2(y) + K_1(y)$.

The above formula (1.4) is useful in calculating the
values of the (n-1)/2 order Bessel function. To determine the value
of the Bessel function for fractional order the following relationship
found in Abramowitz and Stegun (1964) was used:

$$K_{m+1/2}(y) = \left(\frac{\pi}{2y}\right)^{1/2} e^{-y} \sum_{j=0}^{m} \frac{(m+j)!}{j!\,(m-j)!}\, (2y)^{-j}.$$


Given  the  values  of  the  Bessel  function  for  a  fixed  n,  the  probability 
density  function  of  the  distribution  (1.3)  was  easily  evaluated. 
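The two cases can be sketched as follows. This is not the paper's program: in place of the Abramowitz and Stegun polynomial approximations for orders 0 and 1, the sketch substitutes a numerical evaluation of the standard integral representation of K, which is an assumption made only to keep the example self-contained. The upward recursion and the finite sum for half-integer order follow the relations quoted above; all function names are hypothetical.

```python
import math


def bessel_k_integral(order, y, upper=12.0, steps=4000):
    """K_order(y) as the integral of exp(-y*cosh(t))*cosh(order*t), t >= 0.

    Stand-in for the polynomial approximations; evaluated by Simpson's rule.
    """
    h = upper / steps

    def f(t):
        return math.exp(-y * math.cosh(t)) * math.cosh(order * t)

    total = f(0.0) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0


def bessel_k_integer(n, y):
    """K_n(y) for integer n >= 0 by the upward recursion (1.4)."""
    k_prev, k_curr = bessel_k_integral(0, y), bessel_k_integral(1, y)
    if n == 0:
        return k_prev
    for order in range(1, n):
        # K_{order+1} = (2*order/y) * K_order + K_{order-1}
        k_prev, k_curr = k_curr, (2.0 * order / y) * k_curr + k_prev
    return k_curr


def bessel_k_half_integer(m, y):
    """K_{m+1/2}(y) for integer m >= 0 from the finite elementary sum."""
    total = sum(
        math.factorial(m + j) / (math.factorial(j) * math.factorial(m - j))
        * (2.0 * y) ** (-j)
        for j in range(m + 1)
    )
    return math.sqrt(math.pi / (2.0 * y)) * math.exp(-y) * total


print(bessel_k_integer(2, 1.0), bessel_k_half_integer(1, 1.0))
```

The integer-order routine is the one needed when N is even, and the half-integer routine the one needed when N is odd.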


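1.4 Calculation of CDF

The cumulative distribution function was computed using
Simpson's integration method. Simpson's method of numerical
integration approximates the probability density function by
a set of parabolas. In general, Simpson's rule gives

$$\int_a^b f(x)\,dx \approx \frac{\Delta x}{3}\,\bigl[f_0 + 4f_1 + 2f_2 + 4f_3 + \cdots + 2f_{n-2} + 4f_{n-1} + f_n\bigr],$$

where $\Delta x = (b-a)/n$, $f_j = f(a + j\Delta x)$, and n is even.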
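A minimal sketch of this calculation is given below for the case of even n, where the Bessel function in density (1.3) reduces to a finite sum. The density is written as read from (1.3) with $\beta = [a(1-\rho^2)]^{-1}$ and $\gamma = \rho\beta$; the function names, the integration limits, and the number of Simpson intervals are assumptions chosen for illustration, not the settings of the program referenced in the paper.

```python
import math


def covariance_density(c, n, rho, a=1.0):
    """Density (1.3) of the sample covariance for even n (half-integer order)."""
    beta = 1.0 / (a * (1.0 - rho ** 2))
    gamma = rho * beta
    m = (n - 2) // 2                       # Bessel order is m + 1/2
    z = abs(n * c)
    # Polynomial part of z**(m + 1/2) * K_{m+1/2}(beta*z); finite at z = 0.
    poly = sum(
        math.factorial(m + j) / (math.factorial(j) * math.factorial(m - j))
        * (2.0 * beta) ** (-j) * z ** (m - j)
        for j in range(m + 1)
    )
    const = n * (beta ** 2 - gamma ** 2) ** (n / 2.0) / (
        math.sqrt(math.pi) * (2.0 * beta) ** ((n - 1) / 2.0) * math.gamma(n / 2.0)
    )
    # The two exponentials are combined to avoid overflow for large |c|.
    return (const * math.sqrt(math.pi / (2.0 * beta)) * poly
            * math.exp(gamma * n * c - beta * z))


def simpson(f, lower, upper, steps=4000):
    """Composite Simpson's rule with an even number of intervals."""
    if steps % 2:
        steps += 1
    h = (upper - lower) / steps
    total = f(lower) + f(upper)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * f(lower + j * h)
    return total * h / 3.0


def covariance_cdf(x, n, rho, a=1.0, lower=-40.0):
    """P[C <= x], integrating the density from a far-left cutoff."""
    return simpson(lambda c: covariance_density(c, n, rho, a), lower, x)


print(covariance_cdf(0.5, 4, 0.5))
```

Two simple checks on such a routine are that the density integrates to one and that its mean equals the population covariance $\rho a$.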
1.5 Tabulated Cumulative Distribution for the Diagnostics

Critical percentile points of the covariance distribution for
$\rho$ ranging from -0.9 to 0.9 in increments of 0.1, with the sample
size N from 2 to 10, 15, 20, 25, 30, 40 and 50, and the variances
equal to one are contained in Grynovicki (1989). Specifically, that
paper gives the value of $C_{crit}$ such that $P[C \le C_{crit}] = \alpha$, for $\alpha$ = 0.01,
0.05, 0.10, 0.90, 0.95, and 0.99, in which C is the sample covariance
from a bivariate normal with mean 0 and indicated variance-covariance
matrix V.

1.6 CDF Program for Diagnostics

A computer program to calculate the cumulative distribution of
the sample covariance (C/(N-1)) or equivalently the variance component
diagnostics is presented in Grynovicki (1989). The program is written
in Turbo Pascal Version 4.0 (see Miller (1987)) and can be compiled
and run on any IBM-compatible or Macintosh personal computer provided
Turbo Pascal 4.0 is available. The program utilizes Simpson's
integration method and calculates the cdf using a tolerance of $10^{-8}$.
1.7 Validation of Distribution

For $\rho$ = -0.9 to 0.9, in increments of 0.1, and for sample size
N = 2 to 10, and 50, a random sample of 1,000 sample covariances from
a bivariate normal was generated as follows. First, three sets of
1,000 independent standard normal variates $(Y_1, Y_2, Y_3)$ were
generated using the Box-Muller transform. Second, 1,000 x N independent
samples from a bivariate normal distribution were generated with
specified variances and a covariance using the transformation

$$Z^*_1 = \sigma_1\,(\sin(A_1)\,Y_1 + \cos(A_1)\,Y_2) \quad \text{and} \quad Z^*_2 = \sigma_2\,(\sin(A_2)\,Y_3 + \cos(A_2)\,Y_2),$$

in which

$$A_1 = \arccos\bigl[(|\sigma_{12}|/(\sigma_1\sigma_2))^{1/2}\bigr], \quad A_2 = A_1 \text{ if } \sigma_{12} \ge 0, \quad A_2 = \pi - A_1 \text{ if } \sigma_{12} < 0.$$

Finally, the 1,000 covariances were calculated by sequentially
selecting 1,000 pairs $(Z^*_1, Z^*_2)$ of N-vectors and calculating the
covariance $Z^{*\prime}_1 A Z^*_2$, where $A = I - JJ'/N$, I is the N x N identity matrix,
and J is an N column vector of 1's.
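The generation step can be sketched as below. The form of the second transformation (sharing $Y_2$ between the two variates, with $Y_1$ and $Y_3$ as the independent parts) is an assumed reading of the printed formula, chosen because it reproduces the requested covariance; the function names and the seed are illustrative.

```python
import math
import random


def box_muller(rng):
    """One standard normal variate from two uniforms (Box-Muller)."""
    u1 = 1.0 - rng.random()                # in (0, 1], so log(u1) is finite
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def bivariate_pair(rng, var1, var2, cov12):
    """Return (Z1*, Z2*) with the requested variances and covariance."""
    s1, s2 = math.sqrt(var1), math.sqrt(var2)
    angle1 = math.acos(math.sqrt(abs(cov12) / (s1 * s2)))
    angle2 = angle1 if cov12 >= 0 else math.pi - angle1
    y1, y2, y3 = box_muller(rng), box_muller(rng), box_muller(rng)
    z1 = s1 * (math.sin(angle1) * y1 + math.cos(angle1) * y2)
    z2 = s2 * (math.sin(angle2) * y3 + math.cos(angle2) * y2)  # shares y2 with z1
    return z1, z2


def sample_covariance(pairs):
    """Z1*' A Z2* / (N - 1) with A = I - JJ'/N."""
    n_obs = len(pairs)
    mean1 = sum(p[0] for p in pairs) / n_obs
    mean2 = sum(p[1] for p in pairs) / n_obs
    return sum((p[0] - mean1) * (p[1] - mean2) for p in pairs) / (n_obs - 1)


rng = random.Random(1989)
covariances = [
    sample_covariance([bivariate_pair(rng, 1.0, 1.0, -0.6) for _ in range(5)])
    for _ in range(1000)
]
print(sum(covariances) / len(covariances))
```

Each entry of `covariances` plays the role of one simulated diagnostic for N = 5 and $\rho$ = -0.6.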

As a partial check of the density function, a comparison of
the simulation and actual distribution was made using the
Kolmogorov-Smirnov one-sample goodness-of-fit test. The test statistic is

$$D = \max\, |F(x) - S(x)|, \quad -\infty < x < \infty,$$

in which F and S are the theoretical and simulated distribution
functions, respectively. For a sample size of 1,000, the critical
value of this statistic is 0.043 at $\alpha$ = 0.05.
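The comparison can be illustrated for the one case in which the theoretical distribution has a simple closed form: N = 3 (n = 2) with unit variances, where density (1.3) reduces to an asymmetric Laplace density. The sketch below is an illustration under that assumption only; it uses Python's built-in normal generator rather than the Box-Muller and SAS programs described in the paper, and all names are hypothetical.

```python
import math
import random


def laplace_cdf(x, rho):
    """CDF of the sample covariance for N = 3 (n = 2), unit variances."""
    if x < 0:
        return 0.5 * (1.0 - rho) * math.exp(2.0 * x / (1.0 - rho))
    return 1.0 - 0.5 * (1.0 + rho) * math.exp(-2.0 * x / (1.0 + rho))


def simulate_covariances(rng, count, rho, n_obs=3):
    """Sample covariances of n_obs correlated standard normal pairs."""
    results = []
    for _ in range(count):
        z1 = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        z2 = [rho * u + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0) for u in z1]
        m1, m2 = sum(z1) / n_obs, sum(z2) / n_obs
        results.append(
            sum((u - m1) * (v - m2) for u, v in zip(z1, z2)) / (n_obs - 1)
        )
    return results


def ks_statistic(sample, cdf):
    """D = max |F(x) - S(x)| over the ordered sample."""
    ordered = sorted(sample)
    size = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        f = cdf(x)
        d = max(d, i / size - f, f - (i - 1) / size)
    return d


rng = random.Random(2024)
sample = simulate_covariances(rng, 1000, 0.5)
d_stat = ks_statistic(sample, lambda x: laplace_cdf(x, 0.5))
print(f"D = {d_stat:.3f} (critical value 0.043 at the 0.05 level)")
```

A value of D below 0.043 would be read, as in the text, as agreement between the simulated and theoretical distributions at the 0.05 level.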

Comparison of the theoretical and simulated values was made
for values of N from 2 to 10 and 50, for values of $\rho$ in increments of
0.1 between -0.9 and 0.9, and for variances equal to one. Two SAS
computer programs were written to generate the simulated values and to
calculate the Kolmogorov-Smirnov maximum deviation statistic. These
programs are contained in Grynovicki (1989).

The calculated D for the specified parameters can be found in
Grynovicki (1989). All 190 simulations were determined to have a
calculated D below 0.043. Thus, the simulated distribution is
consistent with the one derived when compared at the 0.05 probability
level.

It is worth noting that the maximum deviations occurred at the
center of the distribution and not at the tails.

1.8 Validation of Distribution for Diagnostic Tables

1.8.1 Introduction

Once the distribution for independent diagnostics was
developed and validated, the next step was to determine if the
distribution could be used in evaluating a table of diagnostics that
are correlated. Searle (1971b) has shown that the covariance of two
bilinear forms is equal to

$$\mathrm{Cov}(Z_1'A_{12}Z_2,\ Z_3'A_{34}Z_4) = \mathrm{tr}(A_{12}C_{23}A_{34}C_{41} + A_{12}C_{24}A_{34}'C_{31}),$$

in which $E(Z_1) = E(Z_2) = E(Z_3) = E(Z_4) = 0$,
where $C_{uv} = \mathrm{Cov}(Z_u, Z_v)$ if $u \ne v$ and
$C_{uv} = \mathrm{Var}(Z_u)$ if $u = v$.

Also define $Z' = [Z_1', Z_2', Z_3', Z_4']$, so that $Z \sim N(0, V)$, in which

$$V = \begin{bmatrix} C_{11} & C_{12} & C_{13} & C_{14} \\ C_{21} & C_{22} & C_{23} & C_{24} \\ C_{31} & C_{32} & C_{33} & C_{34} \\ C_{41} & C_{42} & C_{43} & C_{44} \end{bmatrix}.$$

To determine how well the derived distribution fits correlated
diagnostics, an experiment will be simulated at least 200 times and the
calculated diagnostics will be compared with the theoretical
distribution. For simplicity, I will consider a 3-way factorial
experiment with factor 1 random and factors 2 and 3 fixed. In this
simulation, for $\phi_1$, the covariances of the form

$$C_1 = \frac{1}{a_1 - 1} \sum_i (\bar{Y}_{ij_1k_1.} - \bar{Y}_{.j_1k_1.})(\bar{Y}_{ij_2k_2.} - \bar{Y}_{.j_2k_2.}) = \frac{1}{a_1 - 1}\, Z_1'A_{12}Z_2,$$

in which $j_1 \ne j_2$ and $k_1 \ne k_2$, will be the diagnostics used. Also define

$$C_2 = \frac{1}{a_1 - 1} \sum_i (\bar{Y}_{ij_3k_3.} - \bar{Y}_{.j_3k_3.})(\bar{Y}_{ij_4k_4.} - \bar{Y}_{.j_4k_4.}) = \frac{1}{a_1 - 1}\, Z_3'A_{34}Z_4.$$
For this experiment if we let

card j = #({$j_1, j_2$} ∩ {$j_3, j_4$}),

card k = #({$k_1, k_2$} ∩ {$k_3, k_4$}), and

card jk = #({$(j_1, k_1), (j_2, k_2)$} ∩ {$(j_3, k_3), (j_4, k_4)$}),

then the covariance of any two of the diagnostics for $\phi_1$ is

cov  (ClfCj)  ■  26*  if  card  j  ■  card  k  ■  0, 

0*  +  6X  if  card  j  -  1,  card  k  ■  0, 

9*  +  ex  018  if  card  j  ■  0,  card  k  ■  1, 

9*  +  9X  9m  if  card  j  ■  card  k  -  card  jk  ■  1, 

*i*ia  +  h  hi  card  J  -  card  k  -  1,  card  jk  -  0, 

9*  +  Miss  if  card  j  ■  1,  card  k  -  2,  card  jk  -  1, 

6*  +  Miss  if  card  j  ■  2,  card  k  ■  card  jk  ■  1. 

Also,  the  var  (C^  -  9X  +  tfiaaa. 

Other experimental designs are entirely analogous. If a
nonfactorial model is assumed with only one non-nested random
factor, a sample covariance of the form C is still appropriate
although, depending on the nesting, one of the conditions $j_1 \ne j_2$,
$k_1 \ne k_2$ might be relaxed. The variance-covariance matrix V
still has the form assumed even though the variance and covariance may
be different functions of the variance components.

1.8.2 Simulation of a Three-Factor Factorial Experiment

The linear model used in this simulation was

$$Y_{ijkl} = \mu + A_i + AB_{ij} + AC_{ik} + ABC_{ijk} + \varepsilon_{(ijk)l}.$$

Here, $\mu$ represents the grand mean and all fixed effects, and the
remaining terms are independently distributed normal with mean zero and
variance given by the associated variance component. The structure of
the covariance matrix for this design as defined in Hocking (1983) is

V  ■  ^i(Ag  +  Aj)  +  Au  (Aj  +  Au)  (Ag  +  A1#)  +  Agg  (Agg  +  AUg), 
where  A*  -  ( l/a%)  Qx  ®  O,  ®...Gk  «  J^’,  a*,  -  n^a,, 

Gi  -  \  If  1  ^  T  or  if  i  *  T, 

and  At  are  the  eigenvalues  of  Y. 

For this model, each $Z_i$ associated with the
terms comprising the bilinear form has variance
$\mathrm{Var}(Z_i) = \phi_1 + \phi_{12} + \phi_{13} + \phi_{123}$.
Its covariance is $\mathrm{cov}(Z_1, Z_2) = \phi_1$.

Two cases of this design were considered. For the first case,
$a_1 = 3$, $a_2 = 3$, and $a_3 = 2$. In the second case, $a_1 = 3$, $a_2 = 3$,
and $a_3 = 4$. In the first case, 500 sets of $a_1 \times a_2 \times a_3$ independent samples
from a standard normal distribution were generated, and in the second
case, 200 sets of $a_1 \times a_2 \times a_3$ were generated. Both used the Box-Muller transform.
Then, a sample of size $a_1 \times a_2 \times a_3$ was sequentially selected
and multiplied by $V^{1/2}$, where $V^{1/2}$ is the same as formula
1.5 except that the eigenvalues are replaced by their square roots.

For case one, 6 diagnostics were generated per iteration, and
in the second case 36 diagnostics were generated, giving 3,000
diagnostics for case one and 7,200 diagnostics for case two. The
values of the variance components were varied to obtain values of $\rho$
between -0.4 and 0.8. Due to the positive definiteness of the
variance-covariance matrix V, -0.4 was the smallest value one could
expect from this design. The results for both cases are shown in
Table 1.1. For case one, the maximum difference between the simulated
and theoretical distributions ranged between 0.037 and 0.11. However,
for the critical probabilities of .01, .05, and .10, the estimated
critical values were small and conservative. The $P(C \le C_{crit})$
was always larger than what the simulation showed. The difference
between the theoretical and simulated distributions increased as the
probability increased from 0.01 to 0.10. The maximum difference in
the two distributions occurred in the center of the distribution. For
the high critical values in case one and all critical values in case
two, the simulated and theoretical distributions agreed. The maximum
deviation between the theoretical and simulated distributions ranged between .009
and .017 for case two. As in case one, the estimated critical values
were conservative.
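The counts of 6 and 36 diagnostics per iteration follow from enumerating the unordered pairs of fixed-factor cells (j, k) and (j*, k*) with j ≠ j* and k ≠ k*. A short sketch, with hypothetical names, is:

```python
from itertools import combinations, product


def count_diagnostics(a2, a3):
    """Number of unordered cell pairs that differ in both fixed factors."""
    cells = list(product(range(a2), range(a3)))
    return sum(
        1
        for (j, k), (j_star, k_star) in combinations(cells, 2)
        if j != j_star and k != k_star
    )


for a2, a3, iterations in ((3, 2, 500), (3, 4, 200)):
    per_iteration = count_diagnostics(a2, a3)
    print(a2, a3, per_iteration, per_iteration * iterations)
```

Multiplying by the number of simulated experiments gives the totals quoted for the two cases.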
TABLE 1.1

Calculated Kolmogorov-Smirnov One Sample Statistic,
D, and Probability Differences at Critical Values for
Simulated and Theoretical Distribution of Variance
Component Diagnostics for Various Values of $\rho$
when Variances are Equal

a_2 = 3, a_3 = 2

                    Difference at Critical Probabilities ($\alpha$)
   rho      D       .01     .05     .10     .90     .95     .99
  -0.43   0.094   0.007   0.036   0.059   0.027   0.021   0.009
  -0.21   0.110   0.008   0.040   0.071   0.016   0.016   0.007
  -0.09   0.079   0.009   0.035   0.069   0.011   0.010   0.005
   0.04   0.081   0.007   0.038   0.062   0.002   0.007   0.004
   0.13   0.096   0.009   0.039   0.071   0.009   0.004   0.003
   0.25   0.061   0.001   0.038   0.057   0.000   0.006   0.000
   0.41   0.064   0.009   0.036   0.055   0.007   0.004   0.002
   0.61   0.072   0.000   0.033   0.058   0.006   0.002   0.001
   0.82   0.037   0.009   0.029   0.022   0.002   0.009   0.001

a_2 = 3, a_3 = 4

   rho      D       .01     .05     .10     .90     .95     .99
  -0.43   0.017   0.001   0.003   0.002   0.007   0.003   0.002
  -0.21   0.012   0.000   0.001   0.000   0.009   0.001   0.001
  -0.09   0.013   0.001   0.001   0.002   0.008   0.002   0.001
   0.04   0.015   0.002   0.001   0.004   0.008   0.003   0.002
   0.13   0.021   0.000   0.000   0.001   0.009   0.005   0.000
   0.25   0.013   0.002   0.001   0.003   0.006   0.003   0.001
   0.41   0.010   0.003   0.002   0.005   0.008   0.003   0.000
   0.61   0.009   0.002   0.003   0.007   0.006   0.004   0.001
   0.82   0.014   0.002   0.008   0.011   0.005   0.006   0.006
Based on these findings, one can use the table of diagnostics
to identify abnormally large or small covariances in the table. This
diagnostic method gives researchers a tool to investigate
sources of negative variance component estimates, identify outliers,
and reveal model deficiencies.

Having developed the distribution of the diagnostics for the
bilinear form when the sample is from a set of independent
observations distributed $N_2(\mu, V)$, the next step is to develop the
distribution for the diagnostics (covariances) in which the assumption
of independent paired observations does not hold. The development of
this distribution and its validation are presented below.

2.1 Distribution Theory for the Variance Component Diagnostic
for Non-Independent Paired Observations

The final phase in developing the distribution theory for the
variance components was to consider the case where the sample
pairs $(Z_{1j}, Z_{2j})$, $j = 1, 2, \ldots, a_2$, are from a bivariate normal
distribution with variance-covariance structure

$$V = \begin{bmatrix} aI + bJJ' & cI + dJJ' \\ cI + dJJ' & aI + bJJ' \end{bmatrix}.$$

The small letters represent linear combinations of the variance
components as specified by the linear model, I is an identity matrix,
and J is a column of ones. This circumstance arises when dealing
with a linear model of more than one random main effect and then only in regard
to certain variance components associated with the interaction.

The Representation Theorem presented in Green (1987) allows the
diagnostics for designs of all sizes to be estimated in an
unbiased and efficient manner, regardless of the number of random
factors or type of nesting. This theorem states that complex
diagnostics can be written as a linear combination of simpler sample
covariances. Each sample covariance is based on the levels of a
single factor. Thus, the only bilinear forms required are of the
type $Z_1'AZ_2$, in which $Z_1$ and $Z_2$ are vectors of responses that vary the
levels of only one factor, and $A = I - JJ'/a_2$, in which $a_2$ is the number
of levels of that one factor. Thus, in developing the distribution of
the diagnostics for paired samples which are not independent, and
having already attained the distribution for the independent case, the
distribution of the diagnostics for any design with at least one
random factor will be completed.

2.2 Helmert Transformation

The first step in developing this distribution was to
determine a transformation that could change the variance-covariance
structure so that the transformed paired observations would be
independent and have the variance-covariance structure

$$V = \begin{bmatrix} aI & cI \\ cI & aI \end{bmatrix}.$$

Using all but the first row of the Helmert matrix as the matrix of the
transformation, it will be shown that the bilinear form

$$Z_1'AZ_2 = X_{12}'X_{22} = X_{12}'AX_{22} + (a_2 - 1)\,\bar{X}_{12}\bar{X}_{22}, \qquad (2.1)$$

in which $X_{i2} = W_2Z_i$, $W_2$ is the Helmert matrix excluding the first
row, and, on the right-hand side, A is the centering matrix of order
$a_2 - 1$, $A = I_{a_2-1} - J_{a_2-1}J_{a_2-1}'/(a_2 - 1)$.

The Helmert matrix, H, is an orthonormal matrix. The first
row of H is $J'/(a_2)^{1/2}$. For $r = 2, \ldots, a_2$, the r-th row of H has
its first r-1 components equal to $[r(r-1)]^{-1/2}$, the r-th component
equal to $-(r-1)/[r(r-1)]^{1/2}$, and the remaining components equal to 0.
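The row description above is enough to build the matrix and to check numerically the property the argument relies on: dropping the first row and transforming both vectors turns the centered cross product into a plain inner product. The sketch below uses plain Python lists; the function names and the test vectors are illustrative only.

```python
import math


def helmert(size):
    """Helmert matrix with rows as described in the text."""
    rows = [[1.0 / math.sqrt(size)] * size]            # first row: J'/sqrt(size)
    for r in range(2, size + 1):
        scale = 1.0 / math.sqrt(r * (r - 1))
        rows.append([scale] * (r - 1) + [-(r - 1) * scale] + [0.0] * (size - r))
    return rows


def mat_vec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def centered_cross_product(z1, z2):
    """Z1' A Z2 with A = I - JJ'/size."""
    size = len(z1)
    m1, m2 = sum(z1) / size, sum(z2) / size
    return sum((u - m1) * (v - m2) for u, v in zip(z1, z2))


z1 = [1.0, 4.0, 2.0, 7.0]
z2 = [3.0, -1.0, 5.0, 2.0]
w = helmert(len(z1))[1:]                               # all rows but the first
x1, x2 = mat_vec(w, z1), mat_vec(w, z2)

print(centered_cross_product(z1, z2), sum(u * v for u, v in zip(x1, x2)))
```

Both printed values agree, which is the identity $Z_1'AZ_2 = X_{12}'X_{22}$ for this small example.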

PROOF OF 2.1:

Let $Z_1' = (\bar{Y}_{i1k.}, \bar{Y}_{i2k.}, \ldots, \bar{Y}_{ia_2k.})$,

$Z_2' = (\bar{Y}_{i1k^*.}, \bar{Y}_{i2k^*.}, \ldots, \bar{Y}_{ia_2k^*.})$,

$Z_1 \sim N(0, C_{11})$, and
$Z_2 \sim N(0, C_{22})$, with

$C_{11} = aI + bJJ'$,

$C_{22} = aI + bJJ'$, and

$\mathrm{Cov}(Z_1, Z_2) = C_{12} = cI + dJJ'$, in which a, b, c, and d are linear
functions of the variance components.


(2.1) 
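The Helmert matrix H is:

$$H' = [a_2^{-1/2}J,\ W_2'] = [W_1',\ W_2'].$$

Also,

$$X_1 = HZ_1 = \begin{bmatrix} W_1 \\ W_2 \end{bmatrix} Z_1 = \begin{bmatrix} X_{11} \\ X_{12} \end{bmatrix}, \qquad X_2 = HZ_2 = \begin{bmatrix} W_1 \\ W_2 \end{bmatrix} Z_2 = \begin{bmatrix} X_{21} \\ X_{22} \end{bmatrix}.$$

Then, $X_1'X_2 = Z_1'H'HZ_2 = Z_1'Z_2$, since H is orthonormal, and
$X_1'X_2 = Z_1'W_1'W_1Z_2 + Z_1'W_2'W_2Z_2 = a_2\bar{Z}_1\bar{Z}_2 + Z_1'W_2'W_2Z_2$.
Rearranging terms,

$$X_{12}'X_{22} = Z_1'W_2'W_2Z_2 = X_1'X_2 - a_2\bar{Z}_1\bar{Z}_2 = Z_1'Z_2 - a_2\bar{Z}_1\bar{Z}_2 = Z_1'AZ_2.$$

Since, for vectors of length $a_2 - 1$, $A = I - JJ'/(a_2 - 1)$, it follows that
$X_{12}'X_{22} = X_{12}'AX_{22} + (a_2 - 1)\,\bar{X}_{12}\bar{X}_{22}$.
Thus, the bilinear form $Z_1'AZ_2$ is equal to $X_{12}'X_{22}$.

Now, the variance-covariance structure of $(X_{12}, X_{22})$ is of the form

$$V = \begin{bmatrix} aI & cI \\ cI & aI \end{bmatrix}$$

since $W_2C_{11}W_2' = aI$, $W_2C_{22}W_2' = aI$, and $W_2C_{12}W_2' = cI$. Having
established that the bilinear form $Z_1'AZ_2 = X_{12}'AX_{22} + (a_2 - 1)\,\bar{X}_{12}\bar{X}_{22}$,
the next step was to determine the distribution of $X_{12}'AX_{22} + (a_2 - 1)\,\bar{X}_{12}\bar{X}_{22}$.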
First, one must realize that the bilinear form $X_{12}'AX_{22}$ can be
written as a linear combination of central chi-squares and
that $(a_2 - 1)\,\bar{X}_{12}\bar{X}_{22}$ can be written as a linear combination of chi-squares.

Specifically, a property of the bilinear form is that

$$X_{12}'AX_{22} \sim \frac{a}{2}\,\bigl[(1+\rho)\,\chi^2_{a_2-2} - (1-\rho)\,\chi^2_{a_2-2}\bigr],$$

in which a is the common variance of $X_{12}$ and $X_{22}$, $\rho$ is
the correlation between $X_{12}$ and $X_{22}$, and $\chi^2_{a_2-2}$ is the central
chi-square with $(a_2 - 2)$ degrees of freedom.


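PROOF:

Consider the product, $X_1X_2$, of deviations from the sample
mean, in which $X_1$ and $X_2$ are singletons.

Let $X' = (X_1, X_2)$. Then $X = (X_1, X_2)' \sim N_2(0, V^*)$, where

$$V^* = \begin{bmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{bmatrix}.$$

If

$$A = \begin{bmatrix} 0 & 1/2 \\ 1/2 & 0 \end{bmatrix}, \quad \text{then} \quad X_1X_2 = (X_1, X_2)\, A\, (X_1, X_2)'.$$

The characteristic function is given by

$$E(e^{itX'AX}) = \int\!\!\int \frac{1}{2\pi |V^*|^{1/2}} \exp\bigl[\,itX'AX - \tfrac{1}{2}X'V^{*-1}X\,\bigr]\, dX_1\, dX_2,$$

which, since $[(I - 2itAV^*)V^{*-1}]^{-1} = V^*(I - 2itAV^*)^{-1}$, may be written as

$$\int\!\!\int \frac{1}{2\pi |V^*|^{1/2}} \exp\bigl[-\tfrac{1}{2}X'\,(I - 2itAV^*)V^{*-1}\,X\bigr]\, dX_1\, dX_2.$$

Let $W = [(I - 2itAV^*)V^{*-1}]^{-1}$.

By the identity $(2\pi)^{n/2}|W|^{1/2} = \int\cdots\int \exp[-\tfrac{1}{2}X'W^{-1}X]\, dX_1 \cdots dX_n$, one obtains

$$E(e^{itX'AX}) = |V^*|^{-1/2}\, |V^*(I - 2itAV^*)^{-1}|^{1/2} = |I - 2itAV^*|^{-1/2}$$

$$= \begin{vmatrix} 1 - it\rho\sigma_1\sigma_2 & -it\sigma_2^2 \\ -it\sigma_1^2 & 1 - it\rho\sigma_1\sigma_2 \end{vmatrix}^{-1/2} = \bigl(1 - 2it\rho\sigma_1\sigma_2 + t^2(1-\rho^2)\sigma_1^2\sigma_2^2\bigr)^{-1/2}.$$

It follows that $X'AX$ is distributed as $\frac{\sigma_1\sigma_2}{2}[(1+\rho)K_1 - (1-\rho)K_2]$, in which
$K_1$ and $K_2$ are independent central chi-squares with one degree of freedom.
If $Z_1^{*\prime} = (Y_{11}, Y_{12}, \ldots, Y_{1,a_2-1})$
and $Z_2^{*\prime} = (Y_{21}, Y_{22}, \ldots, Y_{2,a_2-1})$, then

$$Z_1^{*\prime}AZ_2^* = X_1'X_2 = X_1'IX_2,$$

where $X_i = WZ_i^*$, W is the $a_2 - 2$ rows of the Helmert matrix, and
$A = I - JJ'/(a_2 - 1)$. Then, the characteristic function
of $Z_1^{*\prime}AZ_2^*$ is

$$E\bigl(\exp(it\,Z_1^{*\prime}AZ_2^*)\bigr) = E\Bigl(\exp\bigl(it \textstyle\sum_r X_{1r}X_{2r}\bigr)\Bigr) = \bigl[1 - 2it\rho\sigma_1\sigma_2 + t^2(1-\rho^2)\sigma_1^2\sigma_2^2\bigr]^{-(a_2-2)/2}.$$

Thus, with $\sigma_1^2 = \sigma_2^2 = a$, the distribution of $Z_1^{*\prime}AZ_2^*$ is equivalent to the distribution
of $\frac{a}{2}[(1+\rho)K_1 - (1-\rho)K_2]$, where $K_1$ and $K_2$ are independent chi-square
variables with $a_2 - 2$ degrees of freedom.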

Second,  dne  must  show  that  (aa- 1)  X1  Kt  is  distributed  as  a  linear 
•  combination  of  central  chi-squares.  Specifically,  if  one  defines 

?i  -  (a,-!)17**!  and  ?,  -  (vl)V%.  then  (?lt  ?a)  «  N,  (0,Z),  where 

Z  -  [  I  a  J 

.  and  the  distribution  of  (ag-1)  is  also  that  of  (a-wix^-ia-c)*^. 

PROOF: 

*  -  <  W  -  ^  [  i’  1'  ]  [  xj  ]• 

in  which  X  -  WZ,  as  previously  defined.  Then 
X  *  N.r,  (Q,V),  and  X  w  Na  (0,i/(ag-l)Z),  where 
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Let  ?,  -  (aj-l)1^.  Then  t  m  Na(0,Z)  andt^*  -  (aj-l^Xj. 
Define  a  2X2  Helmert  matrix,  H  ■  ~I/f  [1  -1  ]•  Then 

W  -  (Wj’.W,’)  -  (H?)’  *  [  (f1+?a)/(2)1/s.  (V?*)/(2)i/S  J  and 

Thus  (Tj+tj)/^)1^1  and  (Y1-?a)/(2)1|/l  are  independently  normally 
distributed  with  variance  (a«)  and  (a-c),  respectively.  By  Theorem 
2.3  in  Hocking  (1984), 

Wj’Wj  «  (i+c)x\, 

Wa’Wa  m  (a-c)xait  and 

Wj’Wi  +  Wj’Wj  -  YjYj  -  (Bj-OXjXj. 
2.4 Distribution of Linear Combinations of Weighted Central Chi-Squares


Define  C  -  T4  -  Ts,  in  which 

T-  -  »i  [  X’^.,  +  4^-x’i  ]•  aacl 
T.  -  bi  [  X1.,.,  ♦  V  **■  ]’ 

*1  ■ 

bt  -  J-KM]. 

$a_1 > 0$, $b_1 > 0$, $a^* = (a+c)/a_1 > 1$, $b^* = (a-c)/b_1 > 1$, and all the chi-squared variates are independent. The distribution of $T_1$ can be represented by

$F_{T_1}(x) = \sum_{i=0}^{\infty} q_i F_{(a_3-1)+2i}(x/a_1)$,

in which $\sum q_i = 1$ and the $q_i$ are weight constants depending on $(a+c)/a_1$ and $a_3$. The weight constant $q_i$ is equal to

t 


in which $\Gamma(1/2) = \pi^{1/2}$, and

$\Gamma(r + 1/2) = 1 \cdot 3 \cdot 5 \cdots (2r-1)\,\pi^{1/2}/2^r$.


PROOF: 

Let $\phi_{\chi^2_n}(t)$ denote the characteristic function of a central chi-square with n degrees of freedom and $\phi_{T_1}(t)$ the characteristic function of $T_1$. Then $\phi_{\chi^2_n}(t) = (1-2it)^{-n/2}$, and, because the chi-squared variates are independent,

W>  ■  %.1)(,)^(‘)' 

The characteristic function of a constant times a central chi-squared variate is given by Robbins and Pitman (1949) as

$\phi_{a^*\chi^2_n}(t) = (1-2ia^*t)^{-n/2} = (a^*(1-2it)-(a^*-1))^{-n/2}$

$= a^{*-n/2}(1-2it)^{-n/2}\left(1-(1-1/a^*)(1-2it)^{-1}\right)^{-n/2}$. (2.2)

By the binomial theorem, we have for $a^* > 0$,

$a^{*-n/2}[1-(1-1/a^*)Z]^{-n/2} = \sum_{j=0}^{\infty} q_j Z^j$ for $|Z| < |1-1/a^*|^{-1}$, (2.3)

where, for $a^* \ge 1$, $q_j \ge 0$ (j = 0, 1, ...) and $\sum q_j = 1$. Since $|1-2it|^{-1} \le 1$ for all real t, it follows from (2.3) that for $a^* \ge 1$,

$(1-2ia^*t)^{-n/2} = \sum_j q_j (1-2it)^{-(n/2+j)} = \sum_j q_j \phi_{\chi^2_{n+2j}}(t)$.

Now, the characteristic function of $T_1/a_1$ may be obtained from (2.2) and the following defining identity for the constants $q_j$, where $N = a_3 - 1$.

[  a*‘N/a  [I-(l-l/a*)ZrN/J  ]  -  Eq^,  (|Z|  s  1). 

It  follows  that 

*(T/il)(t)  “  0-»t)-(N/a)  Ea*  '1,a  Chi-  i/hVi-ao*1  ]”1/a  ], 
$= \sum_{j=0}^{\infty} q_j (1-2it)^{-(N/2+j)}$

$= \sum_{j=0}^{\infty} q_j \phi_{\chi^2_{N+2j}}(t)$,

which is the characteristic function of $T_1/a_1$. Hence, the cdf of $T_1/a_1$ by inversion is $\sum_i q_i F_{N+2i}(t)$, where $F_{N+2i}(t)$ denotes the cumulative distribution function of a central chi-square with N+2i degrees of freedom.
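The mixture representation in (2.2) and (2.3) can be illustrated for the simplest case, a single scaled chi-square $a^*\chi^2_n$ with $a^* \ge 1$. The sketch below is not the authors' program: it assumes the weights are the binomial-series coefficients of (2.3), $q_j = a^{*-n/2}\,\Gamma(n/2+j)/(j!\,\Gamma(n/2))\,(1-1/a^*)^j$, and it uses a hand-written series for the chi-square cdf so that only the standard library is needed. All function names are hypothetical.

```python
import math


def chi2_cdf(x, df):
    """Central chi-square cdf via the series for the regularized
    lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    s, z = df / 2.0, x / 2.0
    term = math.exp(s * math.log(z) - z - math.lgamma(s + 1.0))
    total = 0.0
    for k in range(10000):
        total += term
        term *= z / (s + k + 1.0)
        if k > z and term <= 1e-17 * total:
            break
    return min(total, 1.0)


def mixture_weights(a_star, n, terms):
    """Coefficients q_j of a*^(-n/2) [1 - (1 - 1/a*) Z]^(-n/2)."""
    q = a_star ** (-n / 2.0)
    weights = []
    for j in range(terms):
        weights.append(q)
        q *= (n / 2.0 + j) / (j + 1.0) * (1.0 - 1.0 / a_star)
    return weights


def scaled_chi2_cdf_mixture(x, a_star, n, terms=200):
    """P(a* chi2_n <= x) as a weighted sum of chi-square cdfs
    with n + 2j degrees of freedom."""
    weights = mixture_weights(a_star, n, terms)
    return sum(q * chi2_cdf(x, n + 2 * j) for j, q in enumerate(weights))


# Compare the mixture with the direct evaluation F_n(x / a*).
print(scaled_chi2_cdf_mixture(4.0, 2.5, 3), chi2_cdf(4.0 / 2.5, 3))
```

For a single scaled variate the direct form $F_n(x/a^*)$ is of course simpler; the point of the mixture form is that it extends to sums of differently weighted chi-squares, as used for $T_1$ and $T_2$.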

It follows on setting $x = a_1 t$ that the cdf of $T_1$ is given by $\sum_i q_i F_{N+2i}(x/a_1)$.

Similarly, the cdf of $T_2$ is given by $\sum_j d_j F_{N+2j}(w/b_1)$.

Since $T_1$ and $T_2$ are linear combinations of central chi-squared variates, if $f_{T_1}$ and $f_{T_2}$ denote the densities of $T_1$ and $T_2$ respectively, then

$f_{T_1}(x) = \sum_i (q_i/a_1) f_{N+2i}(x/a_1)$, and

$f_{T_2}(w) = \sum_j (d_j/b_1) f_{N+2j}(w/b_1)$.
PROOF:


E^N+iA/ai)  '■  Zlfemd*/**. 

By the Beppo Levi theorem (Morrison, 1987), this

-  X?q1FN+J,<x/ai)  -  FTl/.<x/Ri)- 
Now,  by  Fubini’s  theorem  (Wheeden  and  Zygmund,  1977), 

F  *  "  ($<1|PN+ii(*/ri)  j  “  n+si( x/»i)  *  ^d^N+ai^^i)1 


2.5  Probability  Penalty  for  Diagnostics 


In this section, the probability density function for $T_1 - T_2 = C$, which is the diagnostic when the sample pairs are not independent, will be developed. Let f(x) and g(w) denote the pdfs of $T_1$ and $T_2$, respectively. By convolution, the pdf of $C = T_1 - T_2$ is

$h(t) = \int_0^{\infty} f(t+w)\,g(w)\,dw$. (2.4)

In the previous section, we have shown that

$f(x) = \sum_i (q_i/a_1) f_{N+2i}(x/a_1)$, $x \ge 0$, and (2.5)

$g(w) = \sum_j (d_j/b_1) f_{N+2j}(w/b_1)$, $w \ge 0$. (2.6)

Since the series converge uniformly, permitting interchange of integration and summation, we may substitute (2.5) and (2.6) into (2.4), and letting M = N, one obtains


h(t)  -  Ww'‘bi>  dw  " 

qq  t(SM+SI+SJ  -  S)/Se‘(‘/»*l) 

5?  2(M+»1+M+lj)/s^(M+ji)/Jbi  (Kt+aj)/ ,rif( M+2i ) / 2)  IX(M+2j)/2)  * 
[C  •'*  *l+b1)/(S*1b1))w  W(M+8J-I)/1  (1+w)(M+si-s)/s  U  dw  (2  7) 

It is worth noting that the integral given below,


1/IXM+2J/2)  £  a-((»i^1)/(a.1b,))w  w(M+aj-s)/»  (i+w)(M4St.«)/s  dW| 

is the confluent hypergeometric function and is identical with the function U(a, b, x) discussed by Slater (1960). Having obtained the distribution of the diagnostic, the problem of how to evaluate it remained. This required the development of new recurrence relations for the definite integral.


2.6  Distribution  of  Bilinear  Form  from  Non-Independent 


It  has  been  shown  above  that  Z^'AZ,  ■  Xj’Xj  +  (a,-l  )  XjX.,, 
in  which  Xj  is  the  Helmert  transformed  data.  If  a3  -  2,  then  Zj’AZj  ■ 
Xj’Xj  -  (aj-i  )  X^j.  It  has  also  been  shown  that  (a3-l  )  m 
(a+cjx^  -  (a-c)x,1,  where  a  is  the  variance  of  X  and  c  is  the  covariance. 

In the linear model context, the variance (a) can be broken down into a set of variance components comprising the covariance ($\beta$), as well as a set that is not contained in the covariance ($\alpha$). Therefore, defining the variance as $a = \alpha + \beta$ and the covariance as $b = \beta$, the distribution of

Zt’A Zj  -  X^;,  «  [(i,  X\  ]. 

Therefore, for N = 2, the distribution of the bilinear form is the distribution of the covariance from independent paired observations with twice the estimated variance.

2.7  Development  of  New  Confluent  Hypergeometric  Recurrence  Relations 

2. 7. 1  Relation  of  Hypergeometric  and  Bessel  Function 

The calculation of the cdf for the bilinear form when the sample pairs are not independent required the development of new recurrence relations for the confluent hypergeometric function. In the notation of Abramowitz and Stegun (1964), equations 13.1.10 and 13.2.5, U(a, b, x) is the confluent hypergeometric function of Kummer and is given by

$U(a, b, x) = \frac{1}{\Gamma(a)} \int_0^{\infty} e^{-xt}\, t^{a-1} (1+t)^{b-a-1}\, dt$.

Abramowitz and Stegun give two special cases for which U(a, b, x) can be written in terms of the modified Bessel functions.

Using these relationships, initial values of the confluent hypergeometric function for the cdf were obtained as follows.

For the case N is odd and i = j, let r = (N - 1)/2 + i, and x = 2z. Then 2r + 1 = N + 2i = N + i + j and r + 1/2 = N/2 + i = N/2 + j. Using Abramowitz and Stegun equation 13.6.21,

$U(N/2 + j,\ N + i + j,\ x) = U(r + 1/2,\ 2r + 1,\ 2z)$

$= \pi^{-1/2} e^{z} (2z)^{-r} K_r(z)$

$= \pi^{-1/2} e^{x/2} x^{-(N+2i-1)/2} K_{(N+2i-1)/2}(x/2)$.

For the case N is even and i = j, let r = (N - 2)/2 + i. Then r + 1 = N/2 + i and 2r + 2 = N + 2i = N + i + j. Using Abramowitz and Stegun equation 13.6.24,

$U(N/2 + j,\ N + i + j,\ x) = U(r + 1,\ 2r + 2,\ 2z)$

$= \pi^{-1/2} e^{x/2} x^{-(N+2i-1)/2} K_{(N+2i-1)/2}(x/2)$.

Note that this expression is identical to the one obtained for odd N.

By choosing i = j = 0 and i = j = 1 with a = N/2 and b = N, one is able to calculate two values of the confluent hypergeometric function for a given value of x. Specifically,

U(N/2, N, x) = U(a, b, x) and
U(N/2 + 1, N + 2, x) = U(a + 1, b + 2, x).

From these two starting values, a recurrence relation is needed to obtain the remaining cases involved in calculating the probability density function.


2.7.2 New Recurrence Relations for Confluent Hypergeometric Functions

The evaluation of the pdf depended on being able to calculate U(a, b + 1, x) and U(a + 1, b + 1, x). From Abramowitz and Stegun equations 13.4.16, 13.4.18, and 13.4.19, replacing a with a + 1 and b with b + 1 in 13.4.16 and 13.4.18, one obtains

(a+x) U(a, b, x) - x U(a, b+1, x) + a(b-a-1) U(a+1, b, x) = 0, (2.7.1)

(b-a-1) U(a, b-1, x) + (1-b-x) U(a, b, x) + x U(a, b+1, x) = 0, (2.7.2)

and (b-a) U(a, b, x) + U(a-1, b, x) - x U(a, b+1, x) = 0. (2.7.3)

From these, it follows that

(b - a)(b - a - 1) U(a + 1, b, x) + (b + x) U(a, b + 1, x)

= x (a + x) U(a + 1, b + 2, x). (2.7.4)

Now, (2.7.1) and (2.7.4) are two equations in the two unknowns U(a + 1, b, x) and U(a, b + 1, x) and the known quantities U(a, b, x) and U(a + 1, b + 2, x). The solutions by Cramer's rule are

U(a+1, b, x) = [x^2 U(a+1, b+2, x) - (b+x) U(a, b, x)] / [b (b-a-1)], and

U(a, b+1, x) = [a x U(a+1, b+2, x) + (b-a) U(a, b, x)] / b.

From these, using recurrence relation 13.4.16 in Abramowitz and Stegun, with b replaced by b + 1, U(a, b + 2, x) can be calculated in terms of U(a, b, x) and U(a, b + 1, x). The process can then be continued to calculate U(a, b + 3, x) and all other values of b for a specific a value. Similarly, recurrence relation 13.4.17, with a replaced by a + 1, gives starting values U(a + 1, b + 1, x) and U(a + 1, b + 2, x). Other entries are obtained for the remaining a + 1 elements by using the same recurrence relation. These recurrence relations were used iteratively to calculate the U functions for fixed i and all j. Thus, the cdf can be evaluated.
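The starting-value step and the forward recurrence in b can be sketched as follows. This is an illustration under stated assumptions, not the authors' Turbo-Pascal code: the two starting-value formulas are the ones obtained by solving (2.7.1) and (2.7.4) for the two unknowns, the step in b is relation 13.4.16 with b replaced by b + 1, and `u_closed` is a hypothetical helper that evaluates U(a, b, x) exactly for integer a >= 1 and integer b >= a + 1 (where the defining integral reduces to a finite sum), used only to check the recurrences.

```python
import math


def u_closed(a, b, x):
    """Exact U(a, b, x) for integers a >= 1, b >= a + 1, obtained by
    expanding (1 + t)^(b-a-1) in the integral representation."""
    m = b - a - 1
    total = sum(math.comb(m, k) * math.factorial(a - 1 + k) / x ** (a + k)
                for k in range(m + 1))
    return total / math.factorial(a - 1)


def start_values(u_ab, u_a1_b2, a, b, x):
    """Solve (2.7.1) and (2.7.4) for U(a+1, b, x) and U(a, b+1, x),
    given U(a, b, x) and U(a+1, b+2, x). Requires b - a - 1 != 0."""
    u_a_b1 = (a * x * u_a1_b2 + (b - a) * u_ab) / b
    u_a1_b = (x * x * u_a1_b2 - (b + x) * u_ab) / (b * (b - a - 1))
    return u_a1_b, u_a_b1


def next_b(u_ab, u_a_b1, a, b, x):
    """U(a, b+2, x) from U(a, b, x) and U(a, b+1, x) (A&S 13.4.16
    with b replaced by b + 1)."""
    return ((b + x) * u_a_b1 - (b - a) * u_ab) / x


a, b, x = 1, 3, 1.7  # illustrative values only
u_a1_b, u_a_b1 = start_values(u_closed(a, b, x), u_closed(a + 1, b + 2, x), a, b, x)
u_a_b2 = next_b(u_closed(a, b, x), u_a_b1, a, b, x)
print(u_a1_b, u_a_b1, u_a_b2)
```

Note that the formula for U(a+1, b, x) divides by b - a - 1, so with a = N/2 and b = N it cannot be used directly when N = 2.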

2.8  Turbo  Program  for  Diagnostics  from  Non-Independent  Observations 

A computer program to calculate the cumulative distribution of linear combinations of central chi-squared variables or, equivalently, the variance component diagnostics based on non-independent paired observations is presented in Grynovicki and Green (1990). The program is written in Turbo-Pascal and can be compiled and run on any IBM compatible personal computer on which Turbo-Pascal is available.

The program utilizes Simpson's integration method and calculates the cdf using a tolerance of 0.0000006.
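The program itself is not reproduced here. As a rough illustration of Simpson's rule driven to a tolerance, the sketch below doubles the number of subintervals until two successive estimates agree to within 0.0000006, and applies it to the integral representation of U(a, b, x). The stopping rule, the truncation point `60 / x`, the restriction to a >= 1, and all function names are assumptions made for this example, not details of the published program.

```python
import math


def simpson(f, lo, hi, n):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0


def integrate_to_tolerance(f, lo, hi, tol=6e-7, n=16, max_n=1 << 20):
    """Double n until successive Simpson estimates differ by less than tol."""
    prev = simpson(f, lo, hi, n)
    while n < max_n:
        n *= 2
        cur = simpson(f, lo, hi, n)
        if abs(cur - prev) < tol:
            return cur
        prev = cur
    raise RuntimeError("Simpson's rule did not reach the requested tolerance")


def kummer_u(a, b, x, upper=None):
    """U(a, b, x) for a >= 1 and x > 0 from its integral representation,
    truncating the infinite range where exp(-x t) is negligible."""
    if upper is None:
        upper = 60.0 / x
    integrand = lambda t: math.exp(-x * t) * t ** (a - 1) * (1.0 + t) ** (b - a - 1)
    return integrate_to_tolerance(integrand, 0.0, upper) / math.gamma(a)


# U(1, 2, x) equals 1/x exactly, which gives a simple check.
print(kummer_u(1, 2, 2.0))
```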

2.9  Validation  of  the  Distribution  for  the  Diagnostics 

For ρ between -0.2 and 0.8, the theoretical distribution was compared to the diagnostics from a three-way hierarchical experiment with factor 1 random, factor 2 nested in factor 1, and factor 3 fixed. In this situation the paired observations comprising the bilinear form are not independent. The experiment was replicated 300 times for each simulation. The diagnostic has the form




^  (?ijk.-?i.k.)(?ijk.-?i.k.)/(a,  -1). 


Two cases were considered to determine how well the derived distribution fits correlated diagnostics from the diagnostic table.

For case 1, a1 = 2, a2 = 5, and a3 = 3. For this case there were three diagnostics per experiment. Case 2 differed from case 1 in that a3 was increased to 4. Both cases were generally similar. The maximum difference for the theoretical distribution in both cases ranged between 0.02 and 0.06, as shown in Table 2.1.

The difference between the theoretical and simulated numbers for the critical values of 0.01, 0.05, 0.10, 0.90, 0.95, and 0.99 ranged between 0.002 and 0.039, with the maximum difference occurring in the center of the distribution. The theoretical numbers were conservative, as in the independent case.
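The statistic D reported in Table 2.1 is the Kolmogorov-Smirnov one-sample statistic, the largest gap between the empirical cdf of the simulated diagnostics and the theoretical cdf. A minimal sketch of that computation follows; the function name is hypothetical, and the three-point sample with a uniform cdf is a toy example rather than the simulation data.

```python
def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov statistic: the maximum distance
    between the empirical cdf of `sample` and the theoretical `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical cdf jumps from (i - 1)/n to i/n at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Toy check against the uniform(0, 1) cdf F(x) = x.
print(ks_statistic([0.7, 0.1, 0.4], lambda x: x))
```

In the validation described above, `sample` would hold the 300 simulated diagnostics for one value of ρ and `cdf` would be the derived theoretical distribution.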

2. 10  Tabulated  Cumulative  Distribution  for  the  Diagnostics 

Cumulative percentile points of the covariance distribution for ρ ranging from -0.7 to 0.9 in increments of 0.1, for sample sizes N of 3 to 10, 15, 20, 25, 30, 40, and 50, and for variance equal to one are contained in Grynovicki (1990). Due to the restriction of positive definiteness, this range of parameters for ρ and N should be sufficient for most designs. Specifically, this table gives the value of $C_{crit}$ such that $P(C \le C_{crit}) = \alpha$ for $\alpha$ = 0.01, 0.05, 0.10, 0.90, 0.95, and 0.99. C is a bilinear form from a bivariate normal with correlated paired observations.
TABLE 2.1

Calculated Kolmogorov-Smirnov One Sample Statistic, D, and Probability Differences at Critical Values for Simulation and Theoretical Distribution of Variance Component Diagnostics for Various Values of ρ from Non-Independent Sample Pairs

                     Difference at Critical Probabilities
   ρ       D        .01     .05     .10     .90     .95     .99

a3 = 3
 -0.2    0.045     0.004   0.028   0.036   0.037   0.029   0.007
 -0.1    0.047     0.005   0.021   0.025   0.027   0.022   0.004
  0.0    0.035     0.005   0.012   0.032   0.032   0.017   0.003
  0.1    0.028     0.007   0.023   0.024   0.020   0.012   0.002
  0.2    0.034     0.004   0.016   0.024   0.025   0.020   0.003
  0.3    0.041     0.003   0.031   0.035   0.036   0.027   0.006
  0.4    0.037     0.006   0.013   0.033   0.034   0.019   0.005
  0.5    0.030     0.007   0.021   0.019   0.019   0.013   0.003
  0.6    0.027     0.006   0.024   0.021   0.020   0.015   0.004
  0.7    0.050     0.002   0.029   0.027   0.028   0.021   0.003
  0.8    0.037     0.008   0.013   0.017   0.015   0.009   0.001

a3 = 4
 -0.2    0.043     0.004   0.029   0.041   0.036   0.031   0.009
 -0.1    0.045     0.003   0.027   0.036   0.030   0.034   0.007
  0.0    0.041     0.003   0.033   0.039   0.029   0.015   0.006
  0.1    0.058     0.008   0.028   0.038   0.042   0.028   0.005
  0.3    0.038     0.005   0.017   0.026   0.027   0.021   0.002
  0.4    0.047     0.002   0.020   0.023   0.033   0.024   0.003
  0.5    0.034     0.005   0.019   0.021   0.025   0.019   0.004
  0.6    0.028     0.009   0.023   0.021   0.019   0.012   0.003
  0.7    0.035     0.006   0.008   0.015   0.020   0.011   0.005
  0.8    0.048     0.006   0.015   0.021   0.013   0.008   0.001
2.11 Illustrated Example Using Eye Glass Manufacturing Experiment

As an illustration of the diagnostic technique in comparison with its cumulative distribution, the diagnostics from an experiment previously examined by Green (1987) concerning eye glass manufacturing will be examined. The data for this experiment are presented in Table 2.2. Factor 1 (run) is random at five levels; factor 2 (pot) is random at two levels and is nested in run; factor 3 (journey) is fixed at five levels; and factor 4 (period) is fixed at three levels. Factors 1, 3, and 4 are crossed.

In the previous analysis, Green determined that runs 2 and 5 were highly variable and that pot 2, in journeys 2, 4, and 5, was clearly different from the rest of the data. The journey 2 between-pot difference is extreme, and the journey 4 and 5, pot 2 values were from a different type of glass than all other responses.


Two diagnostic tables will be re-evaluated and are given in Tables 2.3 and 2.4. Table 2.3 represents the covariance $\sum_j (\bar{Y}_{ij.t.} - \bar{Y}_{i..t.})(\bar{Y}_{ij.t^*.} - \bar{Y}_{i..t^*.})/(a_2 - 1)$ or, in Green's notation, C(i,2/tt*). The variance covariance structure of $(\bar{Y}_{i1.t.}, \bar{Y}_{i2.t.}, \bar{Y}_{i1.t^*.}, \bar{Y}_{i2.t^*.})$ is

$V = \begin{pmatrix} aI + bJ_2J_2' & cI + dJ_2J_2' \\ cI + dJ_2J_2' & aI + bJ_2J_2' \end{pmatrix}$, in which


a  "  *12  +  ^US^  +  *134  +  *1334 /5* 
b  -  *1  +  V5  +  *14  +  *134/5’ 
TABLE 2.2

Glass Manufacture Data

                       Pot 1                Pot 2
                      Period               Period
Run   Journey      1     2     3        1     2     3

 1       1        47    56   100       52    61    88
         2        55    89    93       49    62    97
         3        35    57    56       34    60    72
         4        78    67   113       47    93   118
         5        33    40   128       16    29   130

 2       1        52    66    36       65    80    40
         2        21    61    49      122    97    79
         3        31    39    25       45    54    72
         4        43    72    52      109   120    80
         5        37    51    67       67    85    63

 3       1        50    61    60       75   139   130
         2        33    27    49       46    58    63
         3        24    39    24       15    33    39
         4        18    18    43       22    16    19
         5        28    42    28       27    19    22

 4       1        24    34    43       46    66    24
         2        24    49    42       40   117   105
         3        21    21    51       30    28    34
         4        21    69    48       36    64    53
         5        76    48    42       39    60    78

 5       1        31    54    40       19    93    36
         2        34    24    46       16    12     2
         3       120   122   120       33    58   107
         4       109   119   120       25    63    90
         5        69    49    60       34    43    30
TABLE 2.3

Diagnostics C(i,2/tt*)

                        t*
  i     t         1          2          3

  1     1       50.00       4.00     -15.00
        2                   0.32      -1.20
        3                              4.50

  2     1     1003.50     658.60     470.40
        2                 432.20     308.70
        3                            220.50

  3     1       20.50      49.90      44.20
        2                 121.70     107.60
        3                             95.20

  4     1       12.50      57.00      34.00
        2                 259.90     155.00
        3                             92.50

  5     1     1113.90     467.30     571.10
        2                 196.00     239.60
        3                            292.80
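The run 1 entries of Table 2.3 can be reproduced from the run 1 rows of Table 2.2. The sketch below assumes each data row holds the three period values for pot 1 followed by the three period values for pot 2, one row per journey; the function names are illustrative. It averages over journeys, centres the two pot means at each period, and divides the cross products by (number of pots - 1).

```python
# Run 1 of Table 2.2: one row per journey,
# columns = pot 1 periods 1-3, then pot 2 periods 1-3.
RUN1 = [
    [47, 56, 100, 52, 61, 88],
    [55, 89, 93, 49, 62, 97],
    [35, 57, 56, 34, 60, 72],
    [78, 67, 113, 47, 93, 118],
    [33, 40, 128, 16, 29, 130],
]


def pot_period_means(rows):
    """Average over journeys; returns [pot 1 means, pot 2 means]."""
    n = len(rows)
    cols = [sum(r[c] for r in rows) / n for c in range(6)]
    return [cols[0:3], cols[3:6]]


def diagnostics(rows):
    """Covariances over pots between periods t and t*, keyed by (t, t*)."""
    means = pot_period_means(rows)
    pots = len(means)
    centre = [sum(m[t] for m in means) / pots for t in range(3)]
    dev = [[m[t] - centre[t] for t in range(3)] for m in means]
    return {(t + 1, s + 1): sum(d[t] * d[s] for d in dev) / (pots - 1)
            for t in range(3) for s in range(t, 3)}


for key, value in sorted(diagnostics(RUN1).items()):
    print(key, round(value, 2))
```

With only two pots each diagnostic reduces to half the product of the two between-pot differences, which is why the entries within a run are so strongly related to one another.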
TABLE 2.4

Diagnostics C(t,1/kk*)

                              k*
  t     k        1         2         3         4         5

  1     1      250.7     207.1    -247.2     -61.8    -136.5
        2                336.4    -131.6     221.3     -41.6
        3                          497.5     357.8     124.8
        4                                    620.1      75.5
        5                                              236.3

  2     1      362.1    -312.8      29.1    -371.0    -107.4
        2                799.9    -488.5     153.6     174.1
        3                          631.2     416.2     -41.3
        4                                   1009.6     312.6
        5                                              229.7

  3     1     1012.7     350.0    -330.4       5.5     337.1
        2                676.0    -488.6      41.1     763.8
        3                         1032.3     888.3     103.4
        4                                   1286.9     968.3
        5                                             1530.2
C  -  *u  +  4m/5,  and 
d  ■  *1  +  ^13/  5. 

Transforming  the  ?s  with  the  Helmert  matrix  would  result 
in  the  transformed  data  having  a  variance  covariance  matrix  of  V  with  b 
and  d  set  to  zero.  The  variance  of  the  transformed  data  would  be 
*13  +  *13a/5  +  ^1J84/5  -  260,  the  covariance  +  ^m/5  ■  210,  and 
ρ is 0.8, based on the variance covariance estimates given in Green (1987). For these diagnostics, since N is 2, double the variances and use the 95% critical value with N = 2 and ρ = 0.8. The 95% confidence interval, [-98, 1705.6], is narrower than the 3σ criteria used previously. Due to the large variance of cell means for this table, no outliers were identified. This is consistent with the previous results. The high variability of runs 2 and 5 and the low variability of run 1 are noticeable.

For  Table  2.2,  the  variance  of  the  cell  means  is  + 

<W2  +  *13</2  +  ^J34/2  +  *1  +  *13  +  *14  +  *184'  Table  2.4  represents 
the  covariance  C(t,l/kk*)  ** 

£(  ?i.  kt.  .  kt.  )(?i.k*t.  .  k*t.  )/(ar  1) . 

The  variance-covariance  structure  of  the  cell  means  comprising  this 
bilinear  form  is 

MS  S) 

in  which  a  •  ^  +  ^u/2  +  *13  +  *1S8/2  +  *14  +  *114/2  +  *134  +  *1334/2 
C  "  *1  +  *12/2  +  *14  +  *124/2' 
Using the estimates of the variance components found in Hocking (1989), a = 701.17 and b = 92.43. Thus, the estimated correlation of the independent paired cells of different journey conditions for a given period is 0.132. Using the distribution theory, one can obtain an estimate of the 95% confidence interval, [-442, 716]. Based on this interval, one can see that the period 2, journey 2 and 3 covariance is small and that the period 3, journey (3, 4), (2, 5), and (4, 5) covariances were outside the 95% confidence interval specified above. The low covariance in period 2 may be due to run 3, pot 2, period 3, journey 2, which was identified by Green (1987) as an outlier. The large covariances are because of run 1, journey 3 and run 3, journey 2, period 2 and run 3, journey 3, 4. It should be noted that in run 5, all responses were from different furnaces than were used in the other runs.

2.12 Conclusions

The distribution of the diagnostics for a bilinear form when the sample pairs are independent and when they are not independent has been developed, tabulated, and validated. This theory has been extended to the diagnostic tables for all random and mixed designs. For the special case when N = 2, it has been shown that the bilinear form for non-independent sample pairs is equivalent to the independent case with the variance doubled.
ABSTRACT

The Source Density Function is a four-parameter class of one-sided probability density functions. In order to exploit the Source Density Function's flexibility in shape, programs were developed to estimate the parameters which maximize the log-likelihood function for a given data set.


INTRODUCTION

A brief review of the Source Density Function (SDF) is presented here; a rigorous development was done by Lehnigk [1]. The SDF, f(x,P), is generated from a delta function initial condition solution of the generalized Feller equation.


f(x,P)  «pb-P  x-(P"P+l>/2  z(p+P-l)/2  Iq[2(xz/b)3/?*J  exp[-b‘P(xP+zP)]  (1) 


P-<«  b  p  p )• 

z>0;  b>0;  p<l;  p>0 


$I_q(\cdot)$ is the modified Bessel function of the first kind, where $q = -1 + (1-p)/\beta > -1$. The vector P is composed of the four parameters, which are calculated so that the log-likelihood function is maximized. A data set of observations is formed, which is composed of ordered pairs of the observation variable $x_v$ and the relative frequency of that observation $f_v$. The data set, $\{(x_v, f_v) \mid v = 1, 2, \ldots, n$ with $f_1$ and $f_n \ne 0\}$, is used with f(x,P) to form the log-likelihood function $\phi(P)$.

$\phi(P) = \sum_{v=1}^{n} f_v \ln(f(x_v, P))$ (2)


It should be noted that as $z \to 0$ both equations (1) and (2) approach the Hyper-Gamma density and log-likelihood functions for X-0 [2]. This will be referred to as the Hyper-Gamma limit of the SDF. A transformation of the parameters is useful in simplifying the equations.
a*zlV2 


(3) 


For a maximum of the log-likelihood function, the requirement exists that all of the derivatives of $\phi(P)$ must equal zero. These equations place a further restriction on $\sigma$, and allow the elimination of the parameter b from the equations.

$b^{\beta} = (B(\beta) - \sigma^2)/(1+q)$ (4)

$B(\beta) = \sum_{v=1}^{n} f_v \exp(\beta\rho_v)$ (5)

$\rho_v = \ln(x_v)$ (6)

For b > 0, it is required that $B(\beta) - \sigma^2 > 0$, thus $0 < \sigma < \sqrt{B(\beta)}$. Equation (4) allows the elimination of b from (2), thus $\phi$ is a three-parameter equation.


n 

<}>(a,M  -  In  P  +  H  ln(|j/(B(P)«a2) )  +  (up  -  1)C  -  B(P)+o2  +]P  fv  ln(  SM(rv) )  (7) 


(8) 


Sp,-i(r)  -  (2/r^-l  Iu.i(r)  (rtf)*/  ki  T(k+\x)  (9) 

k-0 


rv  -  2 pa  (B(P)-o2)-!  exp(ppv/2)  (10) 

$\mu = 1 + q$ (11)


Equations (4-11) form the starting point for the numerical estimation of the source density function parameters $\sigma$, $\beta$, and $\mu$.
cj  w  i  gum;*  •  n  w  *  ( •)  c  I •)  at*  ;  i  mri*}  ik<«  ^  »j  gcw » vs  a»JCL—  >  wci 


Initial attempts at parameter estimation of the SDF were based on the simultaneous solution of the derivative equations of (7) set equal to zero. These equations had the following form.
n 

0  -  -a  +5  fv  (Sq(rv))*1  d  Sq(ry)  exp(ppy/2)  (12) 

v-1  drv 

n  n 

0  - (B(p)-o2)(l+|ipC)  -  up  fvpvexp(ppv)  -  ^5  fvPv  dSq(rv)  exp(Ppv)  (13) 

v*l  v*l  Sq(rv)  drv 

n 

0  -  pC  +ln(  \i  (B(P)  -  a2)-i)  +^Z.  lu  dSq<rv>  (14) 

v«l  Sq(rv)  dq 

A three dimensional application of the Newton-Raphson method was used. The functions on the right side of the equal sign of equations (12-14) were used to form a vector, F(σ,β,μ), and a 3x3 derivative matrix of F(·) was numerically calculated. This matrix was inverted and premultiplied the negative of F(·) to yield a change vector for the three parameters. This method failed to produce useful results due to the complexity of the φ(·) function, whose σ-derivative and β-derivative functions typically differed by 10 or more orders of magnitude. The derivative based approach was abandoned in favor of direct optimization methods.

Direct optimization of various log-likelihood functions by Powell's Method has been successful [2, 3, 4, 5], so this technique was applied to equation (7). Initial runs, with the starting point close to the actual parameters that were used to generate the data sets, were successful. But as the starting point was moved further away from the solution, Powell's algorithm ran into difficulties due to its inability to deal with the β-σ boundary generated by B(β) - σ² > 0 (and the flatness of φ(·)).

Powell's method is an unconstrained minimization algorithm. To change a maximum into a minimum, the function is multiplied by -1. In this paper all equations will be presented as they were derived, and it is understood that the log-likelihood function is multiplied by -1 in the computer programs. The next alteration required is to change Powell's algorithm into a constrained minimization. For the Log-Normal, generalized Gumbel, and the Hyper-Gamma distributions all of the constraints were implemented in the calculation of the log-likelihood function. If, in the function subroutine, it was detected that a parameter had gone outside the allowable region, then the function would force the offending parameter into the allowed domain. This proved satisfactory since for these distributions, all of the constraints and the direction vectors for Powell's algorithm were parallel to the coordinate axis system, but for the SDF this was not the case on the β-σ boundary. A modification to the Powell algorithm was made in the minimum bracketing subroutine, MNBRK. If the function detected a parameter which was not in the allowable region, a flag was set; this flag was a signal to MNBRK that a constraint had been crossed. MNBRK would then bisect the interval between the last good point and the desired point which had crossed the boundary, and then try this new point. This procedure was repeated until the test point was in the allowable region. This improved the region of convergence, but it still remained too limited.
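The bisection step described for MNBRK can be sketched in a few lines. This is an illustration only, not the authors' subroutine: the function name, the cap on the number of halvings, and the toy constraint with B fixed at 9 are all assumptions made for the example.

```python
def pull_back(last_good, trial, is_feasible, max_halvings=60):
    """Bisect between the last feasible point and an infeasible trial
    point until the trial point lies in the allowable region."""
    point = list(trial)
    for _ in range(max_halvings):
        if is_feasible(point):
            return point
        # Move halfway back toward the last point known to be feasible.
        point = [(g + p) / 2.0 for g, p in zip(last_good, point)]
    return list(last_good)


# Toy constraint standing in for B(beta) - sigma^2 > 0, with B held at 9
# (hypothetical value); the point is [sigma, beta].
def feasible(point):
    sigma = point[0]
    return sigma > 0.0 and 9.0 - sigma ** 2 > 0.0


print(pull_back([1.0, 2.0], [9.0, 2.0], feasible))  # [2.0, 2.0]
```

Here the trial values of sigma are 9, 5, 3, and then 2; the first three violate the constraint, so the bracketing step would continue from sigma = 2.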

To further modify Powell's algorithm to get a better convergence criterion, it was necessary to examine the structure of the log-likelihood function for the source density function. Figures 1 and 2 show cuts of the log-likelihood function as it varies with σ (β and μ fixed) and μ (β and σ fixed), with the two fixed parameters set at the solution values. The scales on these plots are to demonstrate the flatness of the function. These indicate that the φ(·) function is a well-behaved parabolic type function, and this continues even when the fixed parameters are set at non-solution values (of course with its extremum value decreased). Unfortunately, this is not the case when φ(·) is made a function of β, with μ and σ set at the solution values (shown in Figure 3). During the investigation it was seen that the left-hand peak of Figure 3 was the extremum, while the right-hand peak was a false extremum. If the μ or σ parameter varied off of the solution value, the two peaks moved towards each other and the left-hand peak was absorbed into the right-hand peak. This demonstrates the existence of a ridge that connects the two maximums of Figure 3 together. This ridge must be followed by Powell's algorithm to locate the maximum. Figure 4 shows a typical ridge in σ-β-μ space.


Figure 1. φ(σ), β and μ constant.


Figure 2. φ(μ), σ and β constant.


Figure 3. φ(β), σ and μ constant.

Figure 4. Powell's trajectory in σ-β-μ space.

At first inspection it appears that this ridge was exactly what Powell's method was developed for, but there are problems with traveling along this ridge. The first difficulty is that the relative change in traveling along this ridge is approximately 1 part in 10^5 to 10^6, and it takes numerous iterations to follow the ridge. When the relative change along the ridge is divided by the number of iterations required for that journey, this average relative change is usually less than the termination criterion for Powell's method. Thus, Powell's method terminates the optimization on a false maximum. Two means were employed to alleviate this problem. First, the entire Powell subroutine package was rewritten to perform all calculations in double precision. The evaluation of the log-likelihood function was always performed in double precision to improve accuracy. With Powell's subroutines in double precision, the termination criterion was tightened, which helped to increase the range of convergence. To further increase the convergence area an amplifier function, equation (15), was applied to the log-likelihood function for a second pass after the termination criterion was satisfied on the first pass by the Powell subroutine package.

y«e10(e10<'W*)- 1  )  (15) 

φ* was the final value of φ from the first pass of Powell's algorithm. The second pass of Powell was used to maximize the y function. The amplifier function increases the slope of the function, while eliminating the large dc-offset. From earlier work with this amplifier, it was observed that
the  termination  criteria  was  effectively  changed  from  1  part  in  6x1 0*  (Powell  in  single  precision.  6 
and  y  calculated  in  double  precision)  to  1  pan  in  10 10.  The  actual  amount  of  increase  in  the 
effective  termination  criteria  on  <j>  is  dependent  on  the  difference  in  <f»  and  $*,  a  small  difference 
yielded  a  better  termination  criteria  (1  pan  in  1011)  while  a  large  difference  lessened  the  termination 
criteria  (1  pan  in  109).  Unfortunately;  these  modifications  did  not  fully  solve  the  problem,  but  they 
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did  help.  Occasionally,  the  function  was  so  flat  on  the  ridge  that  even  with  the  amplifier  function, 
Powell’s  termination  criteria  was  satisfied.  It  appears  that  increasing  the  gain  of  the  amplifier 
would  be  of  assistance,  but  Powell’s  trajectory  could  be  close  to  a  boundary  thus  causing  a  large 
change  which  would  result  in  an  overflow.  Restarting  the  amplifier  with  a  new  <j>*  did  help  to 
extend  the  range  of  convergence;  thus  Powell's  algorithm  was  running  with  three  passes,  one  plain 
and  two  with  the  amplifier. 
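
The effect of the amplifier can be illustrated with a short sketch. It assumes that equation (15) reads y = e^10 (e^(10(φ - φ*)) - 1); the gain of 10, the function names and the sample value of φ* are illustrative assumptions, not the authors' FORTRAN code. The sketch shows that y vanishes at φ = φ* (the offset is removed) and that small changes in φ near φ* are magnified by a factor of 10 e^10.

```python
import math

GAIN = 10.0  # assumed reading of the gain in equation (15)


def amplifier(phi, phi_star, gain=GAIN):
    """Amplified objective y = e^gain * (exp(gain * (phi - phi_star)) - 1)."""
    # expm1 keeps precision when phi is very close to phi_star.
    return math.exp(gain) * math.expm1(gain * (phi - phi_star))


def amplifier_slope(phi, phi_star, gain=GAIN):
    """Derivative dy/dphi, i.e. the magnification applied to changes in phi."""
    return gain * math.exp(gain) * math.exp(gain * (phi - phi_star))


if __name__ == "__main__":
    phi_star = -1234.5678  # hypothetical log-likelihood from the first pass
    for delta in (0.0, 1e-9, 1e-6):
        print(delta, amplifier(phi_star + delta, phi_star))
    print("slope at phi*:", amplifier_slope(phi_star, phi_star))
```

Because the exponent grows with φ - φ*, a large jump in φ makes the exponential overflow, which is the risk the text describes when the gain is increased near a boundary.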

Even with this, the convergence range did not equal the allowable space. In some regions,
the Powell algorithm would "lock" on to a false maximum. At some of these false maximums, a
plot of φ as a function of one parameter α, β, or μ would show a maximum, but a ridge did lead
away from this point in a direction oblique to the coordinate axes. During initialization, the Powell
subroutine was given a set of direction vectors, which spanned the space, and Powell's method
searched for successive extrema along these direction vectors. The direction vectors were
changed, allowing an escape from the original false maximum, but the algorithm would usually
fall prey to another. Similarly, Powell's algorithm at times needed to track along a curved ridge or
boundary, but this would trigger a similar false maximum. To get past the false maximum problem,
a steepest descent subroutine package was written. This method was successful in finding the
ridge, but it failed once on the ridge, due to the flatness.

A variable transformation was then tried. Changing to λ did again help extend Powell's range,

e*-B(|3)-<j2  (16) 

but  this  did  not  fully  solve  the  problems. 

Figure 5 is the computer output from four runs. The λ, β and μ values are the initial values.
The λ, β, μ, b, σ and z values are the final Powell estimates. All four runs did converge.


CONCLUSIONS 

Application of Powell's method in three passes does produce accurate estimates of the
parameters of the Source Density Function. The major drawback is the requirement of a starting
point that lies in the convergence zone of the global maximum. In previous programs which
utilized the maximum log-likelihood principle with distributions such as the Log-Normal, generalized
Gumbel, and Hyper-Gamma, the moment estimates became the starting point for Powell's method.
The moment estimates for the Source Density Function require simultaneous solution of four
nonlinear equations, and this has proved to be more difficult than the maximum log-likelihood
estimate.

SOURCE DENSITY FUNCTION CALCULATIONS
Data file is \BDFDATA\SYNEX1B.DAT

                    Run 1           Run 2           Run 3           Run 4
Initial values
  Lambda        .500000E+001    .500000E+001    .000000E+000    .800000E+001
  Beta          .300000E+001    .300000E+001    .300000E+001    .500000E+001
  Mu            .300000E+001    .300000E+001    .300000E+001    .300000E+001

THE POWELL ESTIMATES FOR THE SOURCE DENSITY FUNCTION
  Lambda        .120232E+002    .120232E+002    .120229E+002    .120241E+002
  Beta          .462324E+001    .462324E+001    .462313E+001    .462381E+001
  Mu            .207676E+001    .207876E+001    .207679E+001    .207666E+001
  b             .118021E+002    .118021E+002    .118019E+002    .118027E+002
  Sigma         .343231E+003    .343231E+003    .343193E+003    .343338E+003
  z             .124994E+002    .124994E+002    .124995E+002    .124992E+002

Figure 5. Four sample runs.
ABSTRACT


A hunter attempts to detect and kill targets within a field of obscuring elements, which
are randomly dispersed (trees in a forest). The targets move along paths in the field, which
are partially obscured by the random elements. When a target enters a visible segment of
a path it takes t_0 [seconds] to detect it, and t_1 [seconds] to attempt destroying it. If such
a trial is not successful, other independent trials can be performed as long as the target is
visible. The number of shooting trials that can be attempted depends on the number and
lengths of the visible portions of the path. Lower and upper bounds for the probability
of destroying a target are determined by using the methods of random visibility measures
previously developed by the authors.


Key Words: Poisson Shadowing Process, Bernoulli Trials, Visibility Probabilities,
τ-reduced Measure of Visibility, Detection Probability, Hitting Probability.
0. Introduction

A hunter is trying to detect and hit a target in a forest. Suppose that a target is moving
along a path in the forest and the hunter is located among the trees at some distance from
the path. The path is only partially visible to the hunter; the invisible (shadowed) portion
of the path is obscured by the trees, which are dispersed randomly between the hunter and
the path. A target can be detected by the hunter if at least a certain part of it is visible.
After detection of a target, the hunter starts shooting. The target continues to move along
the path at the same pace. During each shooting trial the target crosses a length τ
of the path. Thus the number of shooting trials in each visible segment depends on the
length of the segment. The shooting trials stop either when the target is hit or when it
enters an invisible portion of the path. When the target enters another visible segment, it
has to be detected again. For simplicity we assume that the shooting trials are Bernoulli,
with probability of failure q, 0 < q < 1.

The problem of target hunting can be treated as a two or three dimensional shadowing
problem. Two dimensional random shadowing problems were previously studied by
Chernoff and Daly [1], Likhterov and Gurin [2], and Yadin and Zacks [3,4]. The methodology
developed in the present paper is also applicable to three dimensional versions of the above
problem, for example, when a hunter tries to shoot down a helicopter whose flying course
is partially obscured by crowns of trees. The three dimensional shadowing problem was
previously studied by Yadin and Zacks [5].

In the present study we develop approximations for (a) the probability of detection;
(b) the probability distribution of the maximal number of shooting trials N; and (c) the
probability of survival of the target. We also provide numerical examples to illustrate the
goodness of these approximations.

1. The Model, Measures of Visibility and Failure Probabilities

Suppose that the hunter is located at the origin, O, and let C denote the path of
the target. C is assumed to be a smooth star-shaped curve, defined by a piecewise
differentiable function r(s), s_L ≤ s ≤ s_U, representing the distance from O to C in
orientation s. The polar coordinates of a point P on C are (r(s), s). The end-points of
C are P_L and P_U. The length of C is

    L = \int_{s_L}^{s_U} l(s)\, ds ,

where

    l(s) = \left[ r^2(s) + \left( \frac{d}{ds} r(s) \right)^2 \right]^{1/2} .

The trees in the forest are represented by random disks dispersed in a region between O and
C. Each random disk is characterized by coordinates (ρ, θ, y), where (ρ, θ) are the polar
coordinates of its center and y is its diameter. The coordinates (ρ, θ, y) belong to a set S
in R^3 satisfying conditions which assure that O is not covered and C is not intersected by
random disks. Let ℬ be the Borel σ-field on the sample space S, and let N{B} designate
the number of disks whose coordinates belong to a set B of ℬ. We assume that, for each
B ∈ ℬ, N{B} is a random variable having a Poisson distribution with mean

    \nu\{B\} = \iiint_{B} dG(y \mid \rho, \theta)\, H(d\rho, d\theta) ,        (1.2)

where G(y | ρ, θ) is the conditional CDF of y, given (ρ, θ), and H(dρ, dθ) is a σ-finite
measure of (ρ, θ). Such a random field of disks is called a Poisson random field.

A point P on C is said to be visible if the line segment OP is not intersected by any
random disk. A point which is not visible is in a shadow. The measure of total visibility
on C is defined as

    V = \int_{s_L}^{s_U} I(s)\, l(s)\, ds ,        (1.3)

where I(s) = 1 if P is visible, and I(s) = 0 otherwise. Notice that V is a random
variable representing the total length of the visible portion of C. V is a sum of a random
number, M, of visible segments of C having random lengths X_1, X_2, ..., X_M; i.e.

    V = \sum_{i=1}^{M} X_i .        (1.4)

A target is detected only if there exists at least one visible segment of length greater
than the minimal path length τ_0 required for identifying the target. In order to develop
a formula for the probability of detecting a target, we introduce the notion of
τ-reduced visibility measure, V(τ), which is the total length of visible segments, each
one reduced by τ units, i.e.,

    V(\tau) = \sum_{i=1}^{M} (X_i - \tau)^+ ,        (1.5)

where a^+ = max(a, 0). The probability that a target is not detected is

    p_0(\tau_0) = \Pr\{ V(\tau_0) = 0 \} .        (1.6)

On the other hand, the probability that C is completely visible is

    p_1 = \Pr\{ V(\tau) = L - \tau \}, \quad \text{for all } 0 \le \tau < L .        (1.7)

Indeed, when C is completely visible, M = 1 and X_1 = L. Let N denote the number
of shooting trials, after detecting a target. If a single shooting trial requires a segment of
length τ to be completely visible, then

    N = \sum_{i=1}^{M} \left[ (X_i - \tau_0)^+ / \tau \right] ,        (1.8)

where [a] is the maximal integer not exceeding a. Notice that

    \frac{1}{\tau} \sum_{i=1}^{M} (X_i - \tau_0 - \tau)^+ \le \sum_{i=1}^{M} \left[ (X_i - \tau_0)^+ / \tau \right] \le \frac{1}{\tau} \sum_{i=1}^{M} (X_i - \tau_0)^+ .        (1.9)

Hence, according to (1.5) and (1.9),

    V(\tau_1)/\tau \le N \le V(\tau_0)/\tau ,        (1.10)

where τ_1 = τ_0 + τ.

If the probability of failure in each shooting trial is q, and the shooting trials are
independent (Bernoulli), the number of shooting trials required until the first success, J, is
distributed geometrically. Accordingly, the probability of failure (not hitting the target) is
Q = E{q^N}. Thus, according to (1.10), lower and upper bounds for Q are, respectively,
Q_0 and Q_1, where

    Q_i = E\{ q^{V(\tau_i)/\tau} \}, \quad i = 0, 1 .        (1.11)

Notice that Q_i is the value of the MGF of V(τ_i) at the point t = (log q)/τ.
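
The bounds (1.10) can be checked on a single realization of the visible segments. The sketch below is an illustration only: the segment lengths and the function names are hypothetical, and the last function bounds q^N for one fixed realization, whereas Q_0 and Q_1 in (1.11) are expectations over the random field.

```python
import math


def reduced_visibility(segments, tau):
    """V(tau): sum of (X_i - tau)^+ over the visible segment lengths X_i."""
    return sum(max(x - tau, 0.0) for x in segments)


def shooting_trials(segments, tau0, tau):
    """N of (1.8): sum of floor((X_i - tau0)^+ / tau)."""
    return sum(math.floor(max(x - tau0, 0.0) / tau) for x in segments)


def trial_bounds(segments, tau0, tau):
    """Lower and upper bounds on N from (1.10), with tau1 = tau0 + tau."""
    lower = reduced_visibility(segments, tau0 + tau) / tau
    upper = reduced_visibility(segments, tau0) / tau
    return lower, upper


def conditional_failure_bounds(segments, tau0, tau, q):
    """Bounds on q**N for this realization (q < 1 reverses the inequality)."""
    lower_n, upper_n = trial_bounds(segments, tau0, tau)
    return q ** upper_n, q ** lower_n


if __name__ == "__main__":
    segments = [0.35, 0.08, 0.52]  # hypothetical visible segment lengths
    n = shooting_trials(segments, tau0=0.1, tau=0.1)
    print(n, trial_bounds(segments, 0.1, 0.1))
    print(conditional_failure_bounds(segments, 0.1, 0.1, q=0.8))
```

Averaging the two conditional bounds over many simulated realizations of the segments would give Monte Carlo estimates of Q_0 and Q_1.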


2. The Moments and Moment Generating Function of V(τ)

For the sake of determining the moments of V(τ) we introduce the following definition
of this measure,

    V(\tau) = \int_{s_{L,\tau}}^{s_{U,\tau}} I_\tau(s)\, l(s)\, ds ,        (2.1)

where I_τ(s) = 1 if a segment of C of length τ, centered at (r(s), s), is completely visible,
and I_τ(s) = 0 otherwise. s_{L,τ} and s_{U,τ} are direction coordinates of points within C, of
distance τ/2 along C from s_L and s_U respectively. More formally, let

    L(s) = \int_{s_L}^{s} l(y)\, dy .        (2.2)

Then s_{L,τ} = L^{-1}(τ/2) and s_{U,τ} = L^{-1}(L - τ/2).
The n-th moment of V(τ) is thus

    \mu_n(\tau) = E\left\{ \left( \int_{s_{L,\tau}}^{s_{U,\tau}} I_\tau(s)\, l(s)\, ds \right)^n \right\}
                = n! \int \cdots \int_{A_{n,\tau}} E\left\{ \prod_{i=1}^{n} I_\tau(s_i) \right\} \prod_{i=1}^{n} l(s_i)\, ds_i .        (2.3)

The set A_{n,τ} is the simplex

    A_{n,\tau} = \{ (s_1, \ldots, s_n) ;\ s_{L,\tau} \le s_1 \le \cdots \le s_n \le s_{U,\tau} \} .        (2.4)

Furthermore, E{∏_{i=1}^{n} I_τ(s_i)} is the probability that the union of n segments of C, each one
of length τ, centered at n points having direction coordinates s_1 < ... < s_n, is completely
visible. This probability is designated by p_n(s_1, ..., s_n; τ). Thus the n-th moment of V(τ)
is

    \mu_n(\tau) = n! \int \cdots \int_{A_{n,\tau}} p_n(s_1, \ldots, s_n; \tau) \prod_{i=1}^{n} l(s_i)\, ds_i .

The method for determining p_n(s_1, ..., s_n; τ) and μ_n(τ) is based on a general methodology
developed by Yadin and Zacks [3,4] for the special case of τ = 0; the modifications required
for τ > 0 are given in a Technical Report [6].
The cumulative distribution function (CDF) of V(τ) is a mixture of a two-point distribution
concentrated on {0, L - τ} and a distribution concentrated on the interval (0, L - τ).
For the purpose of presenting the approximation discussed below, we consider a normalized
measure of visibility W(τ) = V(τ)/(L - τ), which is concentrated on [0,1]. The CDF of
W(τ) can be represented as

    F_\tau(w) = \begin{cases} 0, & \text{if } w < 0 \\ p_0(\tau) + (1 - p_0(\tau) - p_1)\, F^*_\tau(w), & 0 \le w < 1 \\ 1, & 1 \le w . \end{cases}        (3.1)

If, for example, G(y | ρ, θ) is absolutely continuous then F*_τ(w) is an absolutely continuous
CDF on (0,1). Let μ*_n(τ) denote the n-th moment of W(τ). Obviously,
μ*_n(τ) = (L - τ)^{-n} μ_n(τ), n = 1, 2, ... .

Furthermore, for n = 1, 2, ...

    \mu^*_n(\tau) = p_1 + (1 - p_0(\tau) - p_1) \int_0^1 w^n\, dF^*_\tau(w) .        (3.2)

Applying the Dominated Convergence Theorem one immediately proves that
lim_{n→∞} μ*_n(τ) = p_1 for all τ ≥ 0.

Explicit expressions for p_0(τ) and F*_τ(w) are not available. We apply here a beta
approximation to F*_τ(w) and provide a numerical approximation to p_0(τ). This type of
mixed-beta approximation was applied also in [3,4,5]. As will be shown in Section 6, in
some special cases, the first ten moments of W(τ) and of the mixed-beta approximation are
very close. This indicates that in those cases one has a highly effective approximation. In
cases where the moments are not in agreement, a better approximation should be attempted.
The approximating beta-mixture CDF is given by the formula

    \hat{F}_\tau(w) = \begin{cases} 0, & \text{if } w < 0 \\ \hat{p}_0(\tau) + (1 - \hat{p}_0(\tau) - p_1)\, I_w(\alpha_\tau, \beta_\tau), & 0 \le w < 1 \\ 1, & \text{if } 1 \le w , \end{cases}        (3.3)

where I_w(α, β), 0 ≤ w ≤ 1, 0 < α, β < ∞, denotes the incomplete beta function ratio.
The probability p_1 of complete visibility of the segment (s_L, s_U) of C is determined by
the shadowing model, as shown later. The values of p̂_0(τ), α_τ and β_τ are determined by
equating the formulae of the first three moments of F̂_τ(w) to those of W(τ), as shown in
[3].
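
One way to carry out the three-moment matching is sketched below. It is not necessarily the procedure of [3]; it relies only on the fact that the n-th moment of the mixed-beta distribution is p_1 + c ∏_{j=0}^{n-1} (α + j)/(α + β + j) with c = 1 - p_0 - p_1, so that ratios of successive (moment minus p_1) terms give two linear equations in α and α + β. The function names and the numerical values are hypothetical.

```python
def mixed_beta_moments(p0, p1, alpha, beta, n_max):
    """First n_max moments of the mixed-beta distribution (3.3)."""
    c = 1.0 - p0 - p1          # weight of the continuous beta component
    moments, beta_moment = [], 1.0
    for j in range(n_max):
        # E[B^(j+1)] = E[B^j] * (alpha + j) / (alpha + beta + j)
        beta_moment *= (alpha + j) / (alpha + beta + j)
        moments.append(p1 + c * beta_moment)
    return moments


def fit_mixed_beta(m1, m2, m3, p1):
    """Solve for (p0, alpha, beta) from the first three moments and p1."""
    a1, a2, a3 = m1 - p1, m2 - p1, m3 - p1
    r1 = a2 / a1               # (alpha + 1) / (s + 1), with s = alpha + beta
    r2 = a3 / a2               # (alpha + 2) / (s + 2)
    s = (1.0 + r1 - 2.0 * r2) / (r2 - r1)
    alpha = r1 * (s + 1.0) - 1.0
    beta = s - alpha
    c = a1 * s / alpha         # from a1 = c * alpha / s
    return 1.0 - p1 - c, alpha, beta


if __name__ == "__main__":
    # Hypothetical parameters, used only to show that the fit round-trips.
    m = mixed_beta_moments(p0=0.02, p1=0.27, alpha=2.5, beta=2.0, n_max=3)
    print(fit_mixed_beta(m[0], m[1], m[2], p1=0.27))
```

Comparing higher moments of the fitted distribution with those of W(τ), as the paper does for the first ten, is then a direct check on the quality of the fit.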

4. Bounds for the CDF of N and for Q

Inequality (1.10) yields lower and upper bounds for the CDF of N. Indeed, from (1.10),

    F_{\tau_0}\left( \frac{n\tau}{L - \tau_0} \right) \le \Pr\{ N \le n \} \le F_{\tau_1}\left( \frac{n\tau}{L - \tau_1} \right) .        (4.1)

The CDF's in (4.1) can be approximated by the mixed-beta CDF (3.3). According to
(1.11), the lower and upper bounds for the failure probability Q are the values of the
MGF of W(τ_i), i = 0, 1, at the point t = ((L - τ_i)/τ) log q. Let G_τ(t) indicate the MGF of
W(τ). This function can be expressed in terms of the moments of W(τ) as

    G_\tau(t) = 1 + p_1 (e^t - 1) + \sum_{n=1}^{\infty} \frac{t^n}{n!} \left( \mu^*_n(\tau) - p_1 \right), \quad -\infty < t < \infty .        (4.2)

Since μ*_n(τ) - p_1 ↓ 0 as n grows, the infinite series in (4.2) converges faster than e^t, and
therefore a small number of terms will often provide a good approximation. Another
method of approximating G_τ(t) is by employing the MGF of the mixed-beta distribution
(3.3) with p̂_0(τ), α_τ and β_τ.
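
A truncated moment series for the MGF can be sketched as follows. The sketch uses the identity G(t) = 1 + Σ t^n μ_n / n!, rearranged so that the point mass p_1 at w = 1 is summed in closed form as p_1 (e^t - 1) and only the differences μ_n - p_1 remain in the series. The function names, the test distribution (a uniform continuous part) and the numerical values are assumptions made for illustration.

```python
import math


def mgf_from_moments(t, p1, moments):
    """Truncated series 1 + p1*(e^t - 1) + sum_n t^n/n! * (mu_n - p1)."""
    total = 1.0 + p1 * math.expm1(t)
    term = 1.0
    for n, mu_n in enumerate(moments, start=1):
        term *= t / n                 # term is now t^n / n!
        total += term * (mu_n - p1)
    return total


def mgf_argument(path_length, tau_i, tau, q):
    """Point t = ((L - tau_i) / tau) * log q at which the MGF bounds Q."""
    return (path_length - tau_i) / tau * math.log(q)


if __name__ == "__main__":
    # Hypothetical check: continuous part uniform on (0, 1), so that
    # mu_n = p1 + c / (n + 1) and the exact MGF is known in closed form.
    p0, p1 = 0.03, 0.27
    c = 1.0 - p0 - p1
    moments = [p1 + c / (n + 1) for n in range(1, 41)]
    t = mgf_argument(path_length=2.0 * math.pi / 3.0, tau_i=0.0, tau=0.1, q=0.8)
    exact = p0 + p1 * math.exp(t) + c * math.expm1(t) / t
    print(t, mgf_from_moments(t, p1, moments), exact)
```

For negative t of moderate size the terms alternate in sign, so the number of terms needed grows with |t|; the closed-form comparison above is one way to judge where to truncate.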

We provide an example which demonstrates numerically the results
of the present paper. We consider the case of an arc C and annular strip S, which was
discussed in Section 6.1. The parameters of this case are:

θ_L = -π/2, s_L = -π/3, s_U = π/3, θ_U = π/2, r = 1, w = .6, u = .4, λ = 6.

In addition, the diameters are uniformly distributed over the interval (.1, .6).

In Table 5.1 we present the first 10 moments of W(τ), for τ = 0(.1).4. The corresponding
moments of the mixed-beta distribution (3.3) are also given for comparison.

As shown in Table 5.1, the first ten moments obtained from the mixed-beta CDF, F̂_τ(w),
differ from those of the correct distribution only at the 4th decimal place. This reveals an
excellent approximation to the CDF of W(τ) by F̂_τ(w), in the case under consideration.
In Table 5.2 we provide the parameters of the mixed-beta distributions associated with
Table 5.1.

The values of p̂_0(τ) in Table 5.2 provide the mixed-beta approximations to the probabilities
p_0(τ_0) of not detecting a target. This is obviously an increasing function of τ_0. Thus,
in the present example, if τ_0 = .1, p_0(τ_0) = .012, while if τ_0 = .4, p_0(τ_0) = .043. p_1 = .27 is
the probability of complete visibility along the path. Since the moments of the mixed-beta
distributions F̂_τ(w) fitted so well those of W(τ), we replace F_{τ_i}(nτ/(L - τ_i)) with
F̂_{τ_i}(nτ/(L - τ_i)), i = 0, 1. In Table 5.3 we present F̂_{τ_i}(nτ/(L - τ_i)) for τ_i = 0(.1).4, τ = .1.

The values of Q_i = E{exp{t_i W(τ_i)}}, where t_i = ((L - τ_i)/τ) log(q) with q = .8, are
also given in Table 5.3.

As seen in Table 5.3, if τ = .1 and τ_0 = .1 the lower bound of Q is .0967 and the upper
bound for Q is .1273. If, however, τ_0 = 0 then .0704 ≤ Q ≤ .0967.

The bounds for the CDF of N are read from Table 5.3 in a similar manner. For example,
if τ_0 = 0, τ_1 = .1 + τ_0 = .1, then for n = 6, .0435 ≤ P{N ≤ 6} ≤ .0785. If τ_0 = .1 then
τ_1 = .1 + τ_0 = .2 and .0785 ≤ P{N ≤ 6} ≤ .1253. Thus, from the first two columns of
Table 5.3 we obtain that, when τ_0 = 0, the expected number of trials, E{N}, is between
13.7 and 15.1.
.299  .294  .28
.300  .294  .29

TABLE 5.1. Moments of W(τ) (upper line) and of F̂_τ(w) (lower line)
for τ = 0(.1).4 and n = 1, ..., 10.


3.0675   .4   .3069   .0298   .0431   2.8000   2.6076   2.4814
1.8088   2.0334   2.1640   2.3093   2.4808

TABLE 5.2. The Parameters of the Mixed-Beta Distribution
F̂_τ(w) for τ = 0(.1).4. (σ denotes the standard deviations.)


  n    τ_i = 0.0     0.1       0.2       0.3       0.4
  0      0.0064    0.0119    0.0194    0.0298    0.0431
  1      0.0065    0.0122    0.020J    0.0318    0.0466
  2      0.0075    0.0147    0.0255    0.0411    0.0613
  3      0.0104    0.0211    0.0375    0.0804    0.0894
  4      0.0166    0.0332    0.0577    0.0906    0.1307
  5      0.0273    0.0520    0.0870    0.1314    0.1836
  6      0.0435    0.0785    0.1253    0.1819    0.2458
  7      0.0662    0.1131    0.1723    0.2405    0.3145
  8      0.0960    0.1557    0.2269    0.3052    0.3865
  9      0.1333    0.2059    0.2879    0.3736    0.4585
 10      0.1782    0.2628    0.3532    0.4430    0.5272
 11      0.2303    0.3252    0.4209    0.5106    0.5894
 12      0.2890    0.3914    0.4884    0.5735    0.6423
 13      0.3531    0.4592    0.5529    0.6288    0.6836
 14      0.4208    0.5261    0.6115    0.6738    0.7117
 15      0.4902    0.5891    0.6613    0.7063    0.7265
 16      0.5582    0.6448    0.6992    0.7249    1.0000
 17      0.6213    0.6896    0.7226    1.0000    1.0000
 18      0.8750    0.7183    1.0000    1.0000    1.0000
 19      0.7139    1.0000    1.0000    1.0000    1.0000
 20      1.0000    1.0000    1.0000    1.0000    1.0000

  Q      0.0704    0.0967    0.1273    0.1621    0.2000

TABLE 5.3. The CDF F̂_{τ_i}(nτ/(L - τ_i)), with τ = .1, τ_i = 0(.1).4,
L = s_U - s_L; and the corresponding MGF.
COMBAT MODELING


One afternoon of the 35th Conference on the Design of Experiments in Army
Research, Development and Testing was devoted to a Special Session in the
important area of combat modeling. First on the agenda was a paper by Donald
H. McCoy entitled "Statistical Issues Related to Combat Modeling," which is
published in these proceedings in the format of a slide presentation. The
author advised the editor of these proceedings that most of the slides are
self-explanatory; some are not. He figures that anyone who really wants to
follow up would contact him. The second paper planned for this session was
entitled "The Ballistic Research Laboratory Firepower Control Simulation
from Inception to Validation," and it is published in these proceedings.
Unfortunately, its author, Ann E.M. Brodeen, was unable to attend the
conference. Her place on the agenda was filled by a paper entitled "A
Nonparametric Approach to the Validation of Stochastic Simulation Models" by
William E. Baker and Malcolm S. Taylor. The last paper of the Special Session
was presented by Eugene Dutoit. The attendees were given a thirty-page handout
that he prepared for the convenience of the analyst who has to examine the
results of force-on-force combat modeling. He provided these proceedings an
abstract of this handout.




D.  HUE  McCOY 
35th DESIGN OF EXPERIMENTS
TRADOC ANALYSIS COMMAND


TRADOC ANALYSIS COMMAND
(TRAC)

MISSION 

The mission of TRAC is to conduct studies and analysis to
support doctrine, combat and training developments in the
Concept Based Requirements System; lead the TRADOC team
conducting major studies and analysis; and develop and
maintain analytic tools, scenarios and simulations for
analysis and training of Airland Battle operations worldwide

GOALS

★ LEADERSHIP

A Command whose leaders at all levels possess the highest
standards of ethics and professionalism, committed to
excellence in mission accomplishment and the well-being of
subordinates

★ CENTRALIZED COMMAND OF ANALYSIS

A Command which provides analytic service based on a well
developed and managed study program with corporate
development of taskers and plans and fully coordinated
execution


★ INTEGRATED ANALYSIS

A Command whose analytic process ensures a balanced
representation and linkage of the Army's functional areas
and echelons in worldwide joint/combined operations and
environments which are simulated and analyzed

★ DIRECTED RESEARCH

A Command which continually explores emerging technologies
and innovative approaches and harnesses them to improve the
quality and timeliness of its analytic products

★ QUALITY PRODUCTS

A Command which is committed to excellence in analysis and
delivers timely, high quality analysis and simulations to
meet the needs of Army leaders and trainers

★ PROFESSIONAL WORKFORCE

A Command composed of military and civilians who possess the
highest ethical and professional standards, and the desire,
skills and ability to produce the finest analyses for the Army
PROBLEM SOLVING MODEL
AAWS-M ANALYSIS
USES SETUP/EXECUTION

RESEARCH  &  EVALUATION  6-9  MONTHS 

FORCE  CAPABILITY  GRAPHICAL  INPUT 

REQUIREMENTS  ANALYSES  INPUT 



AGGREGATION 



DEAGGREGATION 



VIC  ATTRITION 
ATTRITION COEFFICIENTS


INTERACTION OF FOUR PROCESSES

•  LINE OF SIGHT

•  TARGET ACQUISITION

•  TARGET SELECTION

•  FIRING AND KILLING


EFK


WHERE


h = PROBABILITY THAT A TARGET BEING FIRED ON OR
ACQUIRED WILL BE DESTROYED BY THAT FIRER
BEFORE LINE OF SIGHT IS LOST OR THE TARGET
IS DESTROYED BY ANOTHER FIRER.

EFK = EXPECTED TIME THAT A FIRER SPENDS FIRING
AT A TARGET WHICH HE HAS ACQUIRED AND
SELECTED WHEN THE ENGAGEMENT ENDS IN A
KILL BY THE FIRER (CONDITIONAL KILL RATE).

PF = UNCONDITIONAL PROBABILITY OF FIRING
ATTRITION COEFFICIENT
ASSUMPTIONS 


EXPONENTIAL  DISTRIBUTION  OF 

•  TIME  TO  ACQUIRE 

•  DURATIONS  IN  VISIBLE  OR  INVISIBLE  STATES 

•  TIME  TO  KILL 


EFFECTS  OF  AN  AGGREGATE  GROUP  CAN  BE 
REPRESENTED  BY  A  NUMBER  OF 
"AVERAGE"  ELEMENTS 
BASIC PARAMETERS


•  NUMBER OF FIRERS

•  WEAPON  CHARACTERISTICS 

-  RANGE 

-  FIELD  OF  REGARD 

•  NUMBER  OF  TARGETS  IN  RANGE 

•  PROBABILITY  OF  LINE  OF  SIGHT 

•  ACQUISITION  RATE 

•  RATE  OF  MOVING  OUT  OF  LINE  OF  SIGHT 

•  RATE  THAT  OTHER  WEAPONS  KILL  TARGETS 

•  SELECTION  PRIORITIES 

•  KILL  RATE 

•  FIRING  RATE 
COMPARISON OF STOCHASTIC MODEL RESULTS

Case i  Case  j 
WILCOXON RANK SUM (MANN-WHITNEY) TEST

TIES ARE CONSIDERED
IS THERE A BETTER WAY ?
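
For readers unfamiliar with the test named on this slide, the following is a minimal sketch of the Wilcoxon rank-sum statistic with ties handled by midranks (tied observations share the average of the ranks they occupy). It is a generic textbook computation, not the procedure used in the TRAC analyses; the function names and the sample data are hypothetical.

```python
def midranks(values):
    """Ranks 1..n, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum(case_i, case_j):
    """Return (W, U): rank sum of case_i and the Mann-Whitney U statistic."""
    n_i = len(case_i)
    ranks = midranks(list(case_i) + list(case_j))
    w = sum(ranks[:n_i])
    u = w - n_i * (n_i + 1) / 2.0
    return w, u


if __name__ == "__main__":
    # Hypothetical replications of a model output for two cases.
    print(rank_sum([1, 2, 2], [2, 3]))
```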
Abstract


The Ballistic Research Laboratory Firepower Control Simulation
(BRLFCS) is designed, in part, to support the on-going investigation of
new ways of attacking the problem of data distribution on the
battlefield. Ideally, prior to being utilized, the model should be
validated, i.e., tested whether or not it reasonably approximates the process
of distributing tactical information across the battlefield. However,
model validation generally assumes the availability of empirical data in
order that some comparison may be made between the output
generated by the model and real-world data. Unfortunately, a very limited
empirical data base exists for the validation process. This paper
provides an overview of BRLFCS related issues, i.e., characteristics,
supported applications, planned modifications. More importantly, a
discussion of an approach proposed by Iman, Helton, and Campbell for
validating large-scale computer models by replacing empirical data with
model output will be presented in the context of the BRLFCS validation
process [1,2].
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I.  Introduction 


The  BRLFCS  is  a  large-scale  information  distribution  model  developed  by  the  Weapon 
Systems  Technology  Branch  (WSTB),  System  Engineering  and  Concepts  Analysis  Division 
(SECAD),  BRL  Although  a  limited  verification  has  been  on-going  as  the  model  has  evolved, 
the  question  has  been  continually  raised  as  to  whether  the  model  could  be  statistically  vali¬ 
dated. 

Currently,  limited  data  exists  for  only  a  few  tactical  elements,  e.g.,  the  fire  support  team 
headquarters  (FIST  HQ),  the  Field  Artillery  Battalion  Fire  Direction  Center  (FA  Bn  FDC), 
of  the  several  included  in  the  BRLFCS.  This  data  was  collected  over  the  past  several  years 
from  statistically  designed  firepower  control  experiments  conducted  in  both  research  facility 
and  field  environments  [3, 4, 5, 6, 7],  From  the  scope  of  the  previous  tests,  it  became  evident 
that  significant  monetary  and  human  resources  must  be  expended  to  collect  firepower  control 
data  for  even  a  single  tactical  node.  However,  the  WSTB  is  constructing  its  own  Firepower 
Control  Research  Facility  (FCRF)  which  should  ease  past  resource  burdens  tremendously. 

Statistical  validation  of  the  BRLFCS  is  beset  by  not  only  the  lack  of  experimental  data, 
but  costly  simulation  runs  and  large  numbers  of  input  variables  with  differing  characteristics, 
e.g,  qualitative  and  quantitative,  discrete  and  continuous,  ranges  covering  several  orders  of 
magnitude.  These  are  all  familiar  problems  facing  anyone  wisliing  to  validate  a  large-scale 
simulation  model.  Although  there  has  been  innovative  research  done  in  this  area,  it,  too, 
assumes  the  availability  of  at  least  some  empirical  data  [8].  Fortunately,  there  is  a  technique 
which  holds  promise  for  validating  large-scale  models  encumbered  with  the  types  of 
aforementioned  problems.  This  generalized  technique  was  proposed  by  Iman,  Helton,  and 
Campbell  and  is  outlined  in  a  two-part  journal  article  [1,2]. 

This paper broadly outlines the techniques being proposed to validate the BRLFCS and
the preliminary steps which had been completed at the time of writing to place the validation
process in motion. Consequently, there are no results to report at this time. However, the
author would like to solicit comments and critiques of the proposed solution to this problem,
particularly from those who may have actually used the methodology.

II.  The  Ballistic  Research  Laboratory  Firepower  Control  Simulation 

a.  Characteristics 

The  BRLFCS  will  be  used  to  evaluate  brigade  (bde)  area  firepower  control  concepts  for 
maneuver  (mvr)  and  fire  support  elements.  It  is  not  intended  for  the  model  to  be  all- 
encompassing,  but  rather  to  provide  an  overview  of  the  distribution  of  tactical  information 
across  the  battlefield. 

Some  of  the  relevant  features  of  the  BRLFCS  are  presented  in  Figure  1.  The  version 
represented is a maneuver battalion (mvr bn) supported by field artillery units and is the
version which will initially be validated. There also exists a brigade version which differs from the
battalion version in scale only. Of particular importance with regard to the validation process
is the fact that the BRLFCS is a stochastic model, where a stochastic model is hereby defined as
one in which, for each set of input values, a set of output values occurs with a certain
probability. With such a model, any number of the input variables may be deterministic, so long
as at least one is stochastic. Although a deterministic simulation was initially considered, in
order to meet anticipated needs, a certain degree of randomness was built into the model,
with the capability to suppress it if desired. Therefore, certain features of the BRLFCS were
also designed to be stochastic. For instance, provision was built in to select the time a mission
is initiated. These times may be either assigned explicitly, or the mission initiation rate, i.e.,
number of missions per hour, can be given and the times assigned based on a random number
string.
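
The rate-driven option can be illustrated with a minimal sketch. It is not the BRLFCS code (which is written in C); it assumes that missions arrive as a Poisson process, which the paper does not state, and the function name, rate, and seed are hypothetical.

```python
import random


def mission_start_times(rate_per_hour, duration_hours, seed=None):
    """Return increasing mission initiation times (in hours) in [0, duration_hours).

    Assumption (not from the BRLFCS): missions arrive as a Poisson process,
    so the gaps between initiations are exponential with mean 1/rate_per_hour.
    """
    rng = random.Random(seed)  # a fixed seed reproduces the same random number string
    times = []
    t = rng.expovariate(rate_per_hour)
    while t < duration_hours:
        times.append(t)
        t += rng.expovariate(rate_per_hour)
    return times


# Hypothetical scenario: 6 missions per hour over a 4-hour run.
times = mission_start_times(rate_per_hour=6.0, duration_hours=4.0, seed=1)
```

Fixing the seed reproduces the same initiation times from run to run, which is one way the randomness could be held constant while other inputs are varied.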


•  Land  Based 

•  Any  mix  of  Blue  Forces,  mvr  bde  and  below, 
including  relevant  fire  support 

•  Supports  any  conflict  for  which  data  transmission 
requirements  can  be  specified 

•  Resolution down to individual radios/data distribution units;
operates with ~260 in game; provision for 500

•  Written in 'C'

•  Input requirements: networks; units; transmission lengths
and times; transmitter characteristics and locations; scenario data

•  Outputs:  unit  and  network  loadings;  queues;  message  and  mission 
timelines 

•  Full  scale  runs  made  on  a  CRAY 
Reduced  scale  runs  made  on  a  Gould  9600 

•  Transmissions  may  be  either  TACFIRE  or  packet  format 

•  Accommodates both TACFIRE and packet switching networks

•  Processes  performed  in  parallel 


Figure 1. BRLFCS Features


Figure 2. Units and nets for packet switching applications.


b.  Concept 

Although the simulation was planned so that it will be able to support future
Army/DARPA Command & Control Project (ADDCP) activities, its principal function will
be to demonstrate and evaluate the potential of new concepts of dynamic fire support
management applications at the fighting level (bde and below), in particular the BRL
Information Distribution System (IDS) fact-based technique [9].* In support of the IDS, the
BRLFCS will be used to predict those links and/or procedures for data dissemination that
result in excessive burdens on specific tactical nodes or networks, and to determine which
aspects of the information flow have deleterious effects on mission duration time or asset
utilization.

Overall,  utilization  of  the  computer  simulation  model  should  help  narrow  the  focus  of 
the  on-going  tactical  computer  science  research,  preventing  it  from  pursuing  "blind  alleys". 

c.  Planned  Modifications 

Since  the  BRLFCS  is  designed  to  address  specific  issues  while  continuing  to  support  the 
tactical  computer  science  research  effort,  the  simulation  can  be  modified  as  needed.  One 
such  issue  which  may  necessitate  investigation,  and  which  directly  impacts  the  build  up  of 
queues  in  the  network,  is  the  manner  in  which  high-priority  missions  entering  a  queue  are 
handled. Normally this type of mission should be immediately advanced to the top of the
queue for processing; however, the BRLFCS presently handles all missions on a
first-in-first-out (FIFO) basis. While provision has already been built into the model to
accommodate priority missions, the computer code has not yet been changed to address this issue.
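
The difference between the two queue disciplines can be sketched as follows. This is an illustration only, not the BRLFCS implementation; the class names, mission labels, and priority values are hypothetical, and a larger number is assumed to mean a more urgent mission.

```python
import heapq
from collections import deque
from itertools import count


class FifoQueue:
    """Serves missions strictly in arrival order; priority is ignored."""

    def __init__(self):
        self._items = deque()

    def push(self, mission, priority=0):
        self._items.append(mission)

    def pop(self):
        return self._items.popleft()


class PriorityQueue:
    """Serves the highest priority first; arrival order breaks ties."""

    def __init__(self):
        self._heap = []
        self._arrival = count()

    def push(self, mission, priority=0):
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._heap, (-priority, next(self._arrival), mission))

    def pop(self):
        return heapq.heappop(self._heap)[2]


def service_order(queue, arrivals):
    """Load every arrival into the queue, then drain it."""
    for mission, priority in arrivals:
        queue.push(mission, priority)
    return [queue.pop() for _ in arrivals]


# Hypothetical arrivals: M3 is the only high-priority mission.
arrivals = [("M1", 0), ("M2", 0), ("M3", 1), ("M4", 0)]
fifo_order = service_order(FifoQueue(), arrivals)          # M1, M2, M3, M4
priority_order = service_order(PriorityQueue(), arrivals)  # M3, M1, M2, M4
```

Under FIFO the high-priority mission waits behind everything that arrived earlier, which is the behavior the planned modification is meant to change.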

Two other issues which the simulation does not presently address are unit attrition and
multi-path information routings, i.e., a more advanced scheme for routing packet message
types (only) around the battlefield. These two issues are actually related in that, supposing a
unit is operating at reduced efficiency, it may become desirable to reduce, or suppress
altogether, the amount of message traffic passing through that node. Under the existing routing
algorithm pattern in the BRLFCS, this is impossible. As can be seen from Figure 2, the
networks are now connected by single gateways (located at nodes 49 - 52, 54, 56, 78, and 80),
thus forcing a transmitted packet message to follow a single path regardless of the number of
times the message must be sent. Such a scheme may allow queues of unacceptable length to
build up quickly.


* The basic concept of the IDS is to design a system capable of representing, storing, disseminating, and displaying facts in a tactical
distributed computer environment.

III. Validation of the BRLFCS

a. Verification and "Face Validation"

During the course of its evolution, the BRLFCS has been undergoing almost continual
verification; in other words, the correctness of the model is being established. This phase
may be loosely described as "debugging" the program, e.g., determining the reasonableness of
values of certain model input variables and the correctness of the computer coding used. The
'C' programming language allowed the BRLFCS to be easily structured into modules, or
subprograms. By running the model using data employed in its construction, and observing the
output from these modules, both the developer and "experts" knowledgeable about
information distribution system models feel comfortable that the model is behaving acceptably. When
"experts" are assured that a simulation is realistically representing the assumptions upon which it is
based, this is often referred to as a model having "high face validity".

Performing such a verification allows for a more efficient, simpler simulation
design, which will eventually account for savings in computer time. Also, by previewing the
output of the simulation modules, an experimenter is protected against anomalies which
might occur in the responses when the model is used.

b.  Anticipated  Validation  Approach(es) 

It  was  originally  envisioned  that  verification  and  "face  validation"  of  the  BRLFCS,  as  a 
complete  system,  would  be  the  best  that  even  recent  advancements  could  offer,  particularly  in 
light of the difficulty in obtaining experimental data. Winter et al. state, "The quality of the
component models and the excellent knowledge of the random process along with a
systematic verification must be a substitute for validation [10]."

However, a literature search unveiled a sensitivity approach to the validation of
large-scale computer models which, to the author's knowledge, has not been utilized at the BRL.
The approach is fully outlined in a two-part paper by Iman, Helton, and Campbell. Their
approach focuses on the construction of a response surface as a replacement for the model.
Underlying this approach is the substitution of model output for experimental data (due to
the lack thereof). The remainder of this paper will highlight some of the features and
strategies of this methodology which are being implemented into the validation of the BRLFCS.

Also planned is a statistical validation of the tactical nodes for which experimental data
already exists (and which is independent of any data utilized in the development of the
simulation). Referring to Figure 2, the tactical elements which will be validated are the FIST HQ,
nodes 69 - 72; the Field Artillery Battalion Commander (FA Bn Cdr), node 77; and the FA Battery Fire
Direction Center (FA Btry FDC) positioned at the FA BTRY HQ, node 80. Although some
similar type elements may be currently co-located with other types, or may even change their
physical location in future applications, they are otherwise generic in nature, e.g., the
functions of FIST node 69 are equivalent to FIST node 70.

The approach for validating these nodes will entail a nonparametric procedure recently
developed by Baker and Taylor for a stochastic computer simulation model [8].

IV. Strategies and Features

a.  Preliminary  Discussions  of  Model  Input  and  Output  Variables 

Although  numerous  types  of  descriptive  data  will  be  collected  during  each  simulation 
run, three model outputs have been identified as the measures that will be used in validating
the BRLFCS. The three outputs are: 1) net usage, i.e., the percent of time a specific net is
occupied by message transmissions; 2) unit utilization, i.e., the percent of time a specific unit
is  occupied  with  handling  message  traffic;  and  3)  mission  duration. 
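
As a small worked example of the first two measures, the sketch below computes the percent of a run during which a net (or unit) is occupied, given the start and end times of its transmissions. The interval format and the function name are assumptions made for illustration; the actual BRLFCS output format is not described here.

```python
def percent_occupied(intervals, run_length):
    """Percent of [0, run_length] covered by the union of (start, end) intervals.

    Overlapping intervals are merged so that no time is counted twice.
    """
    busy = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        # Clip each interval to the duration of the run.
        start, end = max(0.0, start), min(run_length, end)
        if end <= start:
            continue
        if current_end is None or start > current_end:
            if current_end is not None:
                busy += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        busy += current_end - current_start
    return 100.0 * busy / run_length


# Hypothetical transmissions (seconds) on one net during a 100-second run.
net_usage = percent_occupied([(0, 10), (5, 20), (40, 50)], run_length=100)
```

Here the first two transmissions overlap and merge into a single 20-second busy period, so the net is occupied for 30 of the 100 seconds.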

The  formats  of  the  required  BRLFCS  inputs  vary.  Some  require  the  simple  assignment 
of a numerical value for program identification purposes only, e.g., packet radios assigned a
code  of  6,  while  others  are  strictly  deterministic  or  stochastic  in  nature.  Still  others  may 
currently  be  designated  as  either  deterministic  or  stochastic  as  mentioned  in  Section  II.a. 

Most of the present effort focuses on discussions being held between the model
developer and the analyst. As a result of these discussions, several issues were identified as
impacting the selection of an appropriate sensitivity technique. First, the developer has
provided the analyst with an assessment of each input variable's anticipated impact on the model
output based on his "expert" opinion. Second, for analysis purposes, it is being assumed that
nonlinear relationships with the model outputs may exist. This does make the construction of
an appropriate response surface a bit more tedious, but doable. However, it is also being
assumed that there are no two-way or higher-order interactions among the input variables. Third,
since the three output measures constitute a time-dependent function of model input, each
input variable must be examined to determine whether its importance changes significantly
over time.

b.  Input  Vector  and  Significant  Input  Variables  Selection  Techniques 

Obviously, in order to fit a response surface, model output must be obtained for various
values of the input variables. The choice of which sampling scheme to use to select values for
the input vectors presented a problem. Random sampling is not appropriate and, as for the
other possibilities, e.g., stratified sampling, double sampling, it nearly boiled down to a "grab
bag" selection process. The sampling technique must take into consideration the possibility
that one or more of the input variables might change in importance over time, as well as
ensure that all portions of each variable's sample space will be represented by input values,
even when that distribution of values covers several orders of magnitude.

The Latin Hypercube Sampling (LHS) technique claims such advantages over other,
more common, sampling schemes [1,2,12]. Another feature of this technique, which makes it
even more advantageous to the BRLFCS validation process, is that the probability
distributions used with LHS do not necessarily have to be the "true" distributions. In fact, if preferred,
the range of values for the input variables may be used in place of probability distributions.
For the majority of the BRLFCS input variables, the range of values is the only information
available.
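
A minimal sketch of LHS driven only by ranges is given below. It assumes a uniform distribution over each range, which is one reading of "the range of values may be used in place of probability distributions"; the function name, the two example ranges, and the number of runs are hypothetical.

```python
import random


def latin_hypercube(ranges, n_runs, seed=None):
    """Return n_runs input vectors, one coordinate per (low, high) range.

    Each range is cut into n_runs equal-width strata and every stratum is
    sampled exactly once, so all portions of each range are represented.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in ranges:
        width = (high - low) / n_runs
        # One uniformly placed point inside each stratum of this variable.
        points = [low + (k + rng.random()) * width for k in range(n_runs)]
        # Shuffle so strata are paired at random across variables.
        rng.shuffle(points)
        columns.append(points)
    return [tuple(column[i] for column in columns) for i in range(n_runs)]


# Two hypothetical inputs with very different scales, five simulation runs.
design = latin_hypercube([(0.0, 1.0), (10.0, 1000.0)], n_runs=5, seed=0)
```

For a variable whose range covers several orders of magnitude, one option (again an assumption, not a step taken from the paper) is to apply the same routine to the logarithm of the variable and transform back before running the model.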

c. Input Variable Ranking and Response Surface Construction


One of the objectives of this sensitivity analysis will be to obtain a ranking of the
potentially important input variables. This result will be used to help drive factors selected for
future  IDS  testing.  There  are  several  regression  techniques  which  may  be  used  to  select  a 
"best  subset"  of  the  predictor  variables.  For  the  BRLFCS  validation,  stepwise  regression  will 
be  utilized  initially  to  construct  a  response  surface  based  on  a  linear  combination  of  the 
independent  (input)  variables  [13]. 

Following an initial fit, several things should be checked, e.g., is the fit adequate, is the
selection of independent variables consistent when similar dependent variables are present, and are the
predictions reasonable. If the response surface is not providing a suitable representation for
model output, then additional work is needed. Earlier it was mentioned that there is the
possibility that the relationship between some, or all, of the BRLFCS input variables and the
outputs is nonlinear. Iman, Helton, and Campbell suggest the use of rank regression as
developed by Iman and Conover [14]. Rank regression is a relatively simple concept. Data are
replaced with their corresponding ranks, and the usual regression procedures are then
performed on these ranks.
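
The rank transformation can be illustrated with a single input variable. The sketch below is a simplified illustration under stated assumptions, not the procedure of [14] in full: it uses one predictor rather than a stepwise multi-variable fit, and the data are invented to show a monotone but nonlinear relationship.

```python
def ranks(values):
    """Ranks 1..n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b, sxy ** 2 / (sxx * syy)


# Invented data: the output rises monotonically but not linearly with the input.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [xi ** 3 for xi in x]

_, _, r2_raw = fit_line(x, y)                 # linear fit to the raw data
_, _, r2_rank = fit_line(ranks(x), ranks(y))  # same fit applied to the ranks
```

Because the ranks of a monotone relationship line up exactly, the fit on ranks is perfect here while the fit on the raw values is not, which is the motivation for trying rank regression when nonlinearity is suspected.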

d.  Other  Statistical  Considerations 

Only  a  few  of  the  ideas  that  must  be  considered  for  the  validation  of  the  BRLFCS,  or  for 
that  matter  any  sensitivity  analysis,  have  been  outlined  using  Iman,  Helton,  and  Campbell  as  a 
guideline.  No  mention  was  made  with  regard  to  the  actual  validation  of  the  response  surface, 
the various diagnostic tools available for obtaining preliminary information for the
construction of the surface, or data transformation. These issues are discussed in References [1,2].

V.  Summary 


The  technique  outlined  by  Iman,  Helton,  and  Campbell  appears  to  be  a  viable  approach 
for  validating  the  BRLFCS.  Additionally,  the  use  of  the  nonparametric  technique  developed 
by Baker and Taylor for stochastic models seems appropriate for performing a statistical
validation of those tactical nodes for which experimental data exists.

A critique of these approaches, as well as suggested alternatives, is invited by the
author.


ABSTRACT


For  three  decades  interest  in  simulation  modeling  and  simulation  languages  has  been 
expanding,  almost  keeping  pace  with  the  phenomenal  rate  of  growth  of  computer  technology. 
Lagging  somewhat  behind  has  been  attention  to  the  validation  of  the  resulting  simulation 
models;  that  is,  the  establishment  of  some  level  of  confidence  that  the  model  does,  in  fact, 
accurately  mimic  some  real-world  process.  In  the  last  fifteen  years,  research  in  validation 
techniques  has  been  substantially  increased;  and  one  general  conclusion  has  been  that 
statistical  tests  are  desirable  in  the  validation  process. 

We  have  adapted  a  nonparametric  statistical  technique  to  validate  a  stochastic 
simulation,  and  this  procedure  has  subsequently  been  applied  to  a  computer  model  currently 
in  use  at  the  US  Army  Ballistic  Research  Laboratory.  Monte-Carlo  methods  have  provided 
an  indication  of  the  power  of  this  statistical  test. 


KEYWORDS: Hypothesis Testing, Ranking Procedures, Power of Test


I. INTRODUCTION


For  three  decades  interest  in  simulation  modeling  and  simulation  languages  has  been 
expanding,  almost  keeping  pace  with  the  phenomenal  rate  of  growth  of  computer  technology. 
Lagging  somewhat  behind  has  been  the  concern  for  the  validation  of  the  resulting  simulation 
models;  that  is,  the  establishment  of  some  level  of  confidence  that  the  model  does,  in  fact, 
accurately  mimic  some  real-world  process.  In  the  last  fifteen  years,  research  in  validation 
techniques  has  been  substantially  increased;  and  a  consensus  of  general  conclusions  has 
formed: 

1. validation is problem dependent - there is no one general validation technique,
mainly  because  the  output  from  a  model  may  be  independent  or  correlated, 
univariate  or  multivariate,  stationary  or  dynamic,  and  so  forth;  in  fact,  the  model 
itself  may  be  deterministic  or  stochastic, 

2.  in  general,  absolute  validity  is  nonexistent  -  once  a  particular  technique  has  been 
established,  the  model  is  usually  validated  only  for  a  specific  purpose  and  over  a 
specific  range  of  values, 

3.  empirical  data  are  necessary  -  in  order  to  validate  a  model,  some  comparison  of 
output  data  with  real-world  data  must  be  made;  furthermore,  these  empirical 
data  must  be  independent  of  those  used  in  construction  of  the  model,  and 

4. statistical tests are desirable - of the many methods proposed for validating
simulation  models,  the  use  of  statistical  tests  seems  to  be  preferred,  possibly 
because  of  the  ability  to  establish  some  level  of  confidence. 

Nonparametric  validation  methods  generally  involve  a  procedure  known  as  hypothesis 
testing.  The  initial  step  is  to  state  a  null  hypothesis,  usually  "the  simulation  model  is  valid." 
Then  a  level  of  confidence  is  established,  often  95%;  and  a  particular  test  statistic  is  chosen. 
Two  different  errors  are  present  in  hypothesis  testing.  The  first  is  called  a  Type  I  error  and 
occurs  when  a  true  null  hypothesis  is  rejected.  If  the  level  of  confidence  has  been  set  at  95%, 
then  it  follows  that  the  probability  of  a  Type  I  error  is  5%.  However,  in  simulation  model 
validation  a  Type  II  error  is  the  more  important  to  control;  this  occurs  when  a  false  null 
hypothesis  is  accepted.  No  level  of  confidence  is  pre-established  to  guard  against  accepting 
an  invalid  model;  but,  for  any  particular  statistical  test,  a  measure  of  the  protection  against 
this  error  is  given  by  the  power  of  the  test,  equal  to  the  probability  of  rejecting  the  null 
hypothesis  when  it  is  false. 

Unfortunately,  there  is  a  tradeoff  between  the  two  error  types;  as  the  level  of  confidence 
is  increased  (lower  probability  of  a  Type  I  error),  the  power  of  the  test  is  decreased  (higher 
probability  of  a  Type  II  error).  This  implies  that  one  way  to  increase  the  power  of  a  test  is  to 
decrease  the  level  of  confidence  in  it.  There  are,  however,  more  satisfactory  ways;  and  they 
will  be  mentioned  in  the  summary  of  this  paper.  The  important  point  to  remember  is  that 
when  attempting  to  validate  a  simulation  model  using  hypothesis  testing,  it  is  imperative  that 
the  statistical  test  be  a  powerful  one. 
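
The tradeoff between the two error types can be seen in a small Monte Carlo experiment. The sketch below uses a one-sided sign test as a stand-in; it is not the validation procedure developed in this paper, and the sample size, shift, number of trials, and function names are all hypothetical choices made for illustration.

```python
import random
from math import comb


def sign_test_critical_value(n, alpha):
    """Smallest c with P(S >= c) <= alpha when S ~ Binomial(n, 1/2)."""
    tail = 0.0
    for c in range(n, -1, -1):
        tail += comb(n, c) / 2 ** n
        if tail > alpha:
            return c + 1
    return 0


def estimated_power(n, alpha, shift, trials=2000, seed=0):
    """Monte Carlo rejection rate of the one-sided sign test of 'median = 0'
    when the data are actually Normal(shift, 1), i.e. the null is false."""
    rng = random.Random(seed)
    critical = sign_test_critical_value(n, alpha)
    rejections = 0
    for _ in range(trials):
        positives = sum(rng.gauss(shift, 1.0) > 0 for _ in range(n))
        rejections += positives >= critical
    return rejections / trials


# Same simulated data (same seed), two levels of confidence.
power_at_95 = estimated_power(n=20, alpha=0.05, shift=0.5)
power_at_99 = estimated_power(n=20, alpha=0.01, shift=0.5)
```

Raising the level of confidence from 95% to 99% raises the critical value, so fewer of the same simulated samples lead to rejection and the estimated power drops.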

II. LITERATURE REVIEW


As  the  electronic  computer  became  a  more  powerful  tool,  computer  simulation  became 
a  more  viable  method  by  which  the  behavior  of  a  given  process  could  be  characterized.  As 
early  as  the  1950’s,  articles  were  being  published  about  computer  modeling  of  entire  systems; 
and  soon  after,  specialized  simulation  languages  were  developed.  The  pioneers  in  this  field 
realized  the  need  for  some  assurance  that  the  simulation  output  would  be  consistent  with  the 
empirical  data  that  were  available.  However,  prior  to  1967  there  was  very  little  written  that 
provided  any  explicit  procedures  which  might  be  applied  to  determine  the  soundness  of  a 
computer  model.  In  that  year  several  papers  concerning  this  problem  were  published,  and 
two  of  them  became  a  foundation  upon  which  most  subsequent  efforts  have  been  constructed. 

In 1967, Fishman and Kiviat1 provided definitions which differentiated the notions of
verification  and  validation,  terms  which  had  previously  been  used  interchangeably. 
"Verification  determines  whether  a  model  with  a  particular  mathematical  structure  and  data 
base  actually  behaves  as  an  experimenter  assumes  it  does.  Validation  tests  whether  a 
simulation  model  reasonably  approximates  a  real  system."  Most  individuals  working  in  this 
area  today  have  subscribed  to  these  definitions,  although  papers  continue  to  be  published 
which do not discriminate between the two ideas. Figure 1, taken from a paper by Winter et
al.,2 is a Venn diagram illustrating the relationship between verification, validation, and other
concepts within the field of computer simulation. Stone3 believed the word assessment "... is
preferable  to  validation  which  has  a  ring  of  excessive  confidence  about  it."  However,  in  this 
paper  we  will  continue  to  consider  validation  as  defined  by  Van  Horn,4  who  expanded  on  the 
previous  definition  by  giving  it  a  somewhat  statistical  flavor.  "Validation  ...  is  the  process  of 
building  an  acceptable  level  of  confidence  that  an  inference  about  a  simulated  process  is  a 
correct  or  valid  inference  for  the  actual  process." 


1 Fishman, G.S. and Kiviat, P.J., "Digital Computer Simulation: Statistical Considerations," Memorandum RM-5187-PR, The Rand
Corporation, 1967.

2 Winter, E.M., Wisemiller, D.P., and Ujihara, J.K., "Verification and Validation of Engineering Simulations with Minimal Data,"
Proceedings of the 1976 Summer Computer Simulation Conference, 1976.

3 Stone, M., "Cross-Validatory Choice and Assessment of Statistical Predictions," Journal of the Royal Statistical Society, Series B-36, 1974.

4 Van Horn, R., "Validation," The Design of Computer Simulation Experiments, Duke University Press, 1969.


FIGURE 1. RELATIONSHIPS BETWEEN THE VARIOUS CONCEPTS OF A COMPUTER SIMULATION


The  second  influential  paper  to  appear  in  1967  was  by  Naylor  and  Finger.5  In  it  they 
proposed a three-stage approach to validation of a computer simulation. This technique, or a
modified  version  of  it,  has  been  used  by  numerous  authors.  Law6  has  augmented  their 
approach  with  specific  suggestions  for  each  of  the  three  stages: 

1.  develop  high  face-validity  -  insure  that  the  simulation  seems  reasonable  to  those 
people  who  are  knowledgeable  in  the  area, 

2. test the simulation assumptions - examine the data used in building the
simulation  and  empirically  test  the  assumptions  drawn  from  those  data,  and 

3.  compare  simulation  output  data  with  empirical  data  -  use  tests,  statistical  if 
possible,  to  determine  a  level  of  confidence  in  the  simulation. 

When  attempting  to  validate  existing  models,  the  first  two  stages  will  often  have  already 
been  completed  by  the  developer  of  the  simulation  leaving  only  the  third  stage,  potentially  the 
most  difficult. 


5 Naylor, T.H. and Finger, J.M., "Verification of Computer Simulation Models," Management Science, Vol. 14, No. 2, 1967.

6 Law, A.M., Simulation Modeling and Analysis, University of Wisconsin, 1979.

Not everyone subscribes to the three-stage approach to validation. However, there does
seem  to  be  a  general  agreement  that  the  third  stage,  comparing  simulation  output  data  with 
empirical  data,  is  crucial.  Sometimes  obtaining  empirical  data  in  the  region  of  applicability  is 
very difficult, especially in engineering simulations. Winter et al.2 mention that in that case, "The
quality of the component models and the excellent knowledge of the random process along
with a systematic verification must be a substitute for validation." However, Fishman and
Kiviat1 are firm in their statement that "... if no numerical data exist for an actual system, it is
not  possible  to  establish  the  quantitative  congruence  of  a  model  with  reality."  In  attempting 
to  perform  this  third  stage,  Wright7  suggests  that  three  questions  be  considered: 

1.  how  do  we  intelligently  compare  simulation  output  data  with  empirical  data, 

2.  how  do  we  collect  and  exploit  the  empirical  data  used  in  our  tests,  and 

3.  how  do  we  transform  the  results  of  these  tests  into  a  confidence  in  the  computer 
simulation? 

Finally, Baird et al.8 warn that the empirical data used for comparison with the simulation
output  data  must  be  independent  of  those  used  in  building  the  computer  model;  otherwise, 
we  have  only  verification  of  the  simulation. 

Tytula9  has  divided  the  many  methods  used  for  the  data  comparison  into  five  general 
categories: 

1.  judgemental  comparison  -  this  method  seems  to  be  the  most  widely  used  and 
includes  graphical  analysis  and  the  comparison  of  common  properties  such  as  the 
mean and variance; it is easy to use and quite practical, but the impact of errors
in  judgement  is  difficult  to  assess, 

2. hypothesis testing - this method includes goodness-of-fit tests, analysis-of-variance
techniques, and nonparametric ranking methods; since this will be the
category  of  interest  in  our  report,  the  advantages  and  disadvantages  will  be 
discussed  in  the  succeeding  section, 

3.  spectral  analysis  -  since  the  output  of  many  simulation  models  is  in  the  form  of  a 
time  series,  this  method  is  particularly  useful;  however,  it  is  difficult  to  relate  the 
invalidity  at  a  particular  frequency  to  the  overall  simulation  validity, 


7 Wright, R.D., "Validating Dynamic Models: An Evaluation of Tests of Predictive Power,"
Proceedings of the 1972 Summer Computer Simulation Conference, 1972.

8 Baird, A.M., Goldman, R.B., Bryan, W.C., Holt, W.C., and Belrose, P.M., "Verification and Validation of RF-Environmental Models -
Methodology Overview," Boeing Aerospace Company, 1980.

9 Tytula, T.P., "A Method for Validating Missile System Simulation Models," Technical Report E-78-11, U.S. Army Missile Research and
Development Command, 1978.


4. sensitivity analysis - this method can determine a range of parameter values and
assumptions  over  which  the  simulation  is  valid,  but  it  is  usually  difficult  to  analyze 
the  effects  of  the  characteristics  drifting  outside  this  range,  and 

5.  indices  of  performance  -  this  method  is  useful  in  ranking  models;  however,  it  is 
impossible  to  pick  a  value  for  a  given  index  which  will  always  imply  a  valid 
simulation. 

Validation is a difficult process because, as Tytula9 points out, no single satisfactory
method exists. Most techniques are problem dependent; and, indeed, the output data of a
simulation may be independent or correlated, univariate or multivariate, stationary or
dynamic. In fact, Garrett10 states that, "The critical dimension affecting the applicability of
various techniques is that of the deterministic or stochastic nature of the output." Only a few
authors have attempted to provide a general validation technique - see Gilmour11 for an
example.  Most  have  developed  methods  which  apply  to  a  select  subset  of  simulation  models; 
and,  even  then,  the  simulation  is  often  validated  only  for  a  particular  purpose  or  over  a 
particular  range  of  values.  In  that  case,  care  must  be  taken  not  to  apply  the  simulation  model 
outside  the  validated  region. 


III. VALIDATION PROCEDURES

In  this  paper  we  will  be  examining  hypothesis  testing  as  a  method  for  validating 
stochastic  computer  simulation  models.  This  type  of  procedure  allows  some  level  of 
confidence  to  be  attached  to  the  results.  When  employing  hypothesis  testing,  several 
assumptions  must  usually  be  stated;  but  by  using  nonparametric  ranking  techniques  we  will 
eliminate  one  major  (and  often  unjustifiable)  assumption  -  that  the  data  arise  from  a  normal 
distribution. 

Sargent12  notes  that  for  hypothesis  testing  we  generally  assume  a  null  hypothesis  that  the 
simulation  model  is  valid.  Then  by  establishing  a  level  of  confidence  for  a  particular 
statistical  test,  we  fix  the  probability  of  a  Type  I  error  in  which  we  reject  a  valid  model. 
However,  for  simulation  validation  it  is  more  important  to  minimize  the  probability  of  a  Type 
II  error,  that  is,  accepting  an  invalid  model.  The  magnitude  of  the  Type  II  error  can  be 
determined  by  the  power  function  of  the  statistical  test  where  the  power  is  the  probability  of 
rejecting  a  false  null  hypothesis.  For  a  fixed  sample  size  there  is  a  tradeoff  between  the  two 
error  types,  so  that  we  can  increase  the  power  at  the  expense  of  the  confidence  level. 
Unfortunately,  the  power  can  not  be  computed  against  an  alternative  hypothesis  as  general 
as,  "The  simulation  model  is  invalid";  and  therefore,  it  must  be  examined  against  an  array  of 
different  specific  alternative  hypotheses.  Nevertheless,  we  continue  to  search  for  powerful 
statistical tests with justifiable assumptions which will still provide acceptable levels of
confidence. 

Let $X = (x_1, x_2, \ldots, x_k)$ be a vector of inputs to a simulation model, and let y be an output
resulting from X. Then y may take on many values in the case of a stochastic model. Let z be
the corresponding value from the real-world process given the same input vector. In general,
y will not be equal to z, since X contains only a finite number of input variables; ostensibly,
the  most  relevant  ones.  The  purpose  of  the  simulation  model  is  to  mimic  the  real-world 
process.  Thus,  in  attempting  to  validate  it,  we  compare  each  empirical  value  with  the 
corresponding  model  output  generated  under  the  same  conditions;  that  is,  the  same  values  for 
the  vector  X. 

Suppose there exist N pairs of data $(y_1, z_1), (y_2, z_2), \ldots, (y_N, z_N)$ available for
comparison, where each pair corresponds to a different input vector and where each $y_i$ is itself
a vector of values from a stochastic model. Reynolds and Deaton13 note that because each
of  the  pairs  was  generated  under  different  conditions,  it  would  be  incorrect  to  pool  the  data 
and  proceed  with  the  testing  of  our  hypothesis.  Rather,  we  must  find  a  statistical  procedure 
which  examines  each  pair  individually  and  then  allows  for  the  combination  of  these  results 
into one overall test that provides reasonable power. With this as our goal, we propose to use
a nonparametric statistical procedure - a process which combines independent cases of the
Mann-Whitney test.

A  stochastic  model  provides  a  set  of  output  values  that,  for  each  given  set  of  input 
values,  occurs  with  a  certain  probability.  Mihram14  states  that  this "...  probability ...  serves  as 
a  measure  of  our  human  ignorance  of  the  actual  situation  and  its  implications."  Generally,  the 
behavior  of  the  system  is  too  complicated  to  include  all  of  the  appropriate  inputs  in  the 
computer  model.  Even  if  it  were  possible,  the  return  in  accuracy  provided  by  such 
thoroughness  may  be  small.  Refinement  of  a  computer  model  usually  leads  to  stochastic 
modeling;  and  because  of  the  abilities  of  today’s  computers,  the  use  of  such  modeling  has 
substantially  increased. 

Given M replications, output of the model becomes a set of values $y^1, y^2, \ldots, y^M$ for each
set of input values which can be compared with (in our case) a single corresponding empirical
value z. Recall that X is a vector of most, but not all, of the relevant input variables. Then z,
given the value of X, is a random variable reflecting the random error due to the exclusion of
certain factors from X. Also y, of course, is a random variable since the simulation model is
stochastic. We would like to show that F(y|X), the conditional distribution function of y, is
equal to G(z|X), the conditional distribution function of z, for all $-\infty < y, z < \infty$ and for all
X.
13 Reynolds, M.R., and Deaton, M.L., "Comparisons of Some Tests for Validation of Stochastic Simulation Models,"
Commun. Statist. - Simula. Computa., Vol. 11, No. 6, 1981

14 Mihram, G.A., Simulation: Statistical Foundations and Methodology, Academic Press, Inc., 1971
Considering N different input sets, the available data consist of N observations
$(y_1^1, y_1^2, \ldots, y_1^M, z_1), (y_2^1, y_2^2, \ldots, y_2^M, z_2), \ldots, (y_N^1, y_N^2, \ldots, y_N^M, z_N)$ of multivariate random
variables, where the y's for any given observation share a common distribution. Mihram14
suggests ranking $y_i^1, y_i^2, \ldots, y_i^M, z_i$ for each i; if the model is valid, we would expect the $z_i$ to fall
somewhere in the middle of such a ranking. This is the initial step in a procedure known as
the Mann-Whitney test, a particular case in which one of the random variables, namely $z_i$, has
a sample size of one. Since we are dealing with N observations, we need a method by which
we can combine independent cases of the Mann-Whitney test; such a method has been
proposed by Van Elteren15 and referenced in a very clear example by Reynolds et al.16.

The Mann-Whitney test is a hypothesis test involving samples from two distributions that
tests for equality of the distributions. For each input set X a sample of M output sets
$y^1, y^2, \ldots, y^M$ is obtained from the computer simulation, and the empirical observation z
provides another sample of size one. The following three assumptions are made:

1)  both  samples  are  random  samples  from  their  respective  populations, 

2)  in  addition  to  independence  within  each  sample,  there  is  mutual  independence 
between  the  two  samples,  and 

3)  the  measurement  scale  is  at  least  ordinal. 

The  third  assumption  means  that  for  any  two  observations  on  the  random  variable  we  can 
distinguish  which  is  larger  and  which  is  smaller. 

The null hypothesis is that F(y|X) = G(z|X) for a given input set X. When we combine
N of these tests, in the manner suggested by Van Elteren, we have the null hypothesis of
F(y|X) = G(z|X) for all $-\infty < y, z < \infty$ and for all X, which we can interpret as, "The
simulation model is valid." Let $R_i$ be the rank of $z_i$ in the ith observation $(y_i^1, y_i^2, \ldots, y_i^M, z_i)$;
thus, $R_i$ is an integer between 1 and M + 1. Then a test statistic T is defined as the sum of
the $R_i$'s over all N observations; that is, $T = \sum_{i=1}^{N} R_i$. Very high or very low values of T will
cause rejection of the null hypothesis. The theory behind the Mann-Whitney test is given in
Conover7, and the combination of such tests is explained by Van Elteren15.
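
The ranking step and the statistic T can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' program: the function names `midrank` and `rank_sum_statistic` are hypothetical, and ties are given the average of the ranks they span, which is the usual midrank convention.

```python
def midrank(value, sample):
    """Rank of `value` within sample + [value], averaging ranks over ties."""
    below = sum(1 for y in sample if y < value)
    tied = sum(1 for y in sample if y == value)
    # The tied group (including `value`) occupies ranks below+1 .. below+tied+1;
    # the midrank is the average of those ranks.
    return below + 1 + tied / 2


def rank_sum_statistic(model_outputs, empirical_values):
    """T = sum over observations of the rank of z_i among (y_i^1..y_i^M, z_i)."""
    return sum(midrank(z, ys) for ys, z in zip(model_outputs, empirical_values))
```

With M model values per observation each rank lies between 1 and M + 1, so T always falls between N and N(M + 1).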

A  fourth  assumption  is  usually  made,  that  both  samples  consist  of  random  variables 
from  continuous  distributions.  This  is  to  assure  that  there  will  be  no  zeros  and,  more 
importantly,  no  ties.  However,  for  this  test,  a  moderate  number  of  ties  is  tolerable;  and  they 
are  handled  by  assigning  each  of  the  tied  values  the  average  of  the  ranks  normally  due  them. 
15 Van Elteren, P., "On the Combination of Independent Two Sample Tests of Wilcoxon,"
Bulletin de l'Institut International de Statistique, 37, 1960.

16 Reynolds, M.R., Burkhart, H.E., and Daniels, R.F., "Procedures for Statistical Validation of Stochastic Simulation Models,"
As mentioned earlier, a misuse of hypothesis testing as a method of simulation
validation occurs when too little concern is shown for the power of the test. The power is the
probability of rejecting an invalid model, and we would like this probability to be as close to
one as possible. Unfortunately, the power can be calculated only for specific alternative
hypotheses. In order to generate power curves for this combination of Mann-Whitney tests, it
is convenient to make one additional, albeit restrictive, assumption; namely, the distribution
of the $y_i$'s is the same for each vector of input values, and similarly for the distribution of the
$z_i$'s. Although it would be preferable to avoid this assumption, it is necessary in order to test
against specific alternative hypotheses - in this case, a shift in the mean.

Figure 2 shows some power curves for this test when the underlying distributions are
normal and the mean of the distribution of the $z_i$'s varies from zero. Recall that a true null
hypothesis would indicate that the means of both F and G tend to be equal to zero. These
curves were generated using a Monte-Carlo procedure which incorporated 10,000
replications. Note the increase in power as the number of observations increases. Figures 3-5
display some power curves for other alternative hypotheses, each figure assuming a
different common distribution for F and G with a corresponding modification of the mean of
G. Notice when the abscissa is equal to zero (when the null hypothesis is true), the
probability of rejection is 0.05 - the value chosen for the probability of a Type I error. The
faster the curve approaches one, the more powerful the test against that particular alternative
hypothesis. Although very narrow in their scope, these results do provide us with an
indication of the overall power of the test against a shift in location and allow us to determine
the extent to which the probability of a Type II error might be reduced by an increase in
sample size. Reynolds and Deaton13 look at some test statistics similar to T designed to be
more powerful against other alternative hypotheses.
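
A Monte-Carlo power estimate of the kind described above might be sketched as follows. This is a hedged illustration rather than the original program: the function name `estimate_power`, the default of 2,000 trials (the paper used 10,000), the seed, and the use of the normal approximation to T without a tie correction (ties have probability zero for continuous data) are all assumptions made here.

```python
import random
from statistics import NormalDist


def estimate_power(shift, n_obs, n_reps, alpha=0.05, trials=2000, seed=0):
    """Estimate P(reject H0) when F = Normal(0,1) and G = Normal(shift,1)."""
    rng = random.Random(seed)
    mean_t = n_obs * (n_reps + 2) / 2                        # E[T] under H0
    sd_t = (n_obs * n_reps * (n_reps + 2) / 12) ** 0.5       # sd of T, no ties
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _trial in range(trials):
        t = 0
        for _obs in range(n_obs):
            z = rng.gauss(shift, 1.0)                        # empirical value
            # rank of z among n_reps simulated model outputs plus itself
            t += 1 + sum(1 for _ in range(n_reps) if rng.gauss(0.0, 1.0) < z)
        if abs(t - mean_t) / sd_t > z_crit:
            rejections += 1
    return rejections / trials
```

Calling `estimate_power(0.0, 10, 10)` should return a value near the nominal level, and increasing `shift` or `n_obs` should raise the estimate, which is the behavior the power curves display.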


IV.  EXAMPLE 

The  Vulnerability  Analysis  for  Surface  Targets  (VAST)  model  is  a  computer  simulation 
currently  in  use  at  the  Ballistic  Research  Laboratory  to  evaluate  the  effect  of  kinetic  energy 
projectiles  or  shaped-charge  threats  against  a  single  surface  target.  It  incorporates  damage 
from  both  the  primary  penetrator  and  any  associated  spall  fragments;  but  currently  it  is 
unable  to  handle  damage  resulting  from  blast,  heat,  and  certain  synergistic  effects  such  as 
ricochets.  Furthermore,  there  is  a  variety  of  opinions,  estimates,  and  decisions,  all  based  on 
the  experience  of  the  vulnerability  analysts  but  generally  providing  vague  and  imprecise  data, 
which  subsequently  serve  as  input  to  the  simulation.  Nevertheless,  results  demonstrate 
reasonable  face  validity,  so  an  attempt  at  statistical  validation  of  the  model  seems  feasible. 

A  target  description  is  produced  by  a  separate  computer  code  using  a  combination  of 
geometric  figures  and,  once  generated,  can  be  viewed  from  any  orientation.  After  a  viewing 
angle  has  been  established,  a  rectangular  grid  is  superimposed  over  the  target  in  the  plane 
orthogonal  to  that  angle.  From  a  (uniform)  randomly-selected  point  within  each  grid  cell,  a 
FIGURE 2: POWER OF 5%-LEVEL TEST
H0: F=G=NORMAL(0,1) VS. H1: F=NORMAL(0,1), G=NORMAL(a≠0,1)

FIGURE 3: POWER OF 5%-LEVEL TEST
H0: F=G=UNIFORM(-1,1) VS. H1: F=UNIFORM(-1,1), G=UNIFORM(a≠-1,1)

FIGURE 4: POWER OF 5%-LEVEL TEST
H0: F=G=CAUCHY(0,1) VS. H1: F=CAUCHY(0,1), G=CAUCHY(a≠0,1)

FIGURE 5: POWER OF 5%-LEVEL TEST
H0: F=G=LOGISTIC(0,1) VS. H1: F=LOGISTIC(0,1), G=LOGISTIC(a≠0,1)
ray is traced through the target; and a list is constructed of all components encountered. If a
spall-producing component is encountered, spall rays are traced from that point of impact to
all critical components in the target. These rays represent spall fragments whose size, shape,
and  velocity  are  chosen  at  random  from  specified  distributions. 


Along each individual ray, residual masses and velocities of the primary penetrator and
associated  spall  fragments  are  used  to  calculate  the  probability  of  incapacitation  for  each 
critical  component.  These  are  then  combined  over  all  critical  components  and  provide  a  loss 
of  function  (LOF)  for  the  particular  cell,  further  combined  over  all  cells  to  provide  a  LOF  for 
the  particular  orientation,  and  finally  combined  over  several  orientations  to  provide  an  overall 
LOF  for  the  target. 


Data  were  provided  by  vulnerability  assessors  who  had  estimated  loss  of  function  for  a 
particular  surface  target  based  on  their  inspection  of  actual  damage  from  a  particular  round 
of  ammunition  -  in  this  case,  the  function  evaluated  was  the  mobility  function.  When 
attempting  to  compare  model  output  with  this  empirical  data,  it  was  first  necessary  to 
determine  the  exact  point  of  impact  on  the  surface  target  during  the  live-fire  exercise.  Then 
the  VAST  model  assumed  that  point  of  impact  to  be  the  origin  of  the  ray  representing  the 
primary penetrator. Damage due to that ray and its associated spall rays was then combined
to  provide  a  LOF  value  which  could  be  compared  with  the  empirical  datum  point.  Therefore, 
only  one  orientation  was  considered  and,  for  that  particular  orientation,  a  ray  originating  at  a 
specific  point  within  only  one  cell  was  examined.  Encountering  a  spall-producing  component 
still  required  a  random  selection  of  spall  characteristics;  and  because  execution  time  was 
reduced,  the  model  was  run  using  thirty  replications  -  the  output  data  appear  in  Table  1.  This 
output from the thirty replications was compared with the empirical data, using the
method  proposed  for  stochastic  simulations. 


Table 2 contains the results. Recall that $R_i$ is the rank of $z_i$ in the ith observation
$(y_i^1, y_i^2, \ldots, y_i^M, z_i)$, and T is defined as the sum of the $R_i$'s. Under the null hypothesis of a
valid model, $z_i$ has the same distribution as $y_i^1, y_i^2, \ldots, y_i^M$; and therefore, $R_i$ is uniformly
distributed over the values 1, 2, ..., M + 1. Lehmann19 shows how to establish critical values
against which the test statistic can be evaluated. Modifying his results by incorporating the
number of tied observations, we can calculate the expectation of the test statistic,

$$E[T] = \frac{1}{2}\,[N(M+2)], \qquad (1)$$

and the variance of the test statistic,

$$\mathrm{Var}[T] = \frac{1}{12}\,[N M (M+2)] - \frac{1}{12\,[M+1]} \sum_{i=1}^{N} \sum_{j=1}^{n_i} \left(d_{ij}^3 - d_{ij}\right), \qquad (2)$$

where N is the number of observations, M is the number of replications of the model, and $d_{ij}$
represents the number of tied values for the jth tie in the ith observation with $n_i$ different ties
in the ith observation. Then $T^* = (T - E[T])/\sqrt{\mathrm{Var}[T]}$ will have approximately a standard
TABLE 1. LOSS OF FUNCTION VALUES - MOBILITY KILL (Cont’d)


Shot Number


.7007  .6941  .7431  .4015  .4015  .4015  .4015  .9095  .8486  .4015  .4015  .6664  .6585 

.6474  .6182  .6344  .6707  .6490  .8091  .6772  .8318  .6339  .7693  .6497  .7537  .6812 


TABLE 2. HYPOTHESIS TEST

Shot Number   Empirical Value   Rank within Model Values
    43             .734                 16
    44             .145                 11
    45            1.000                 16
    46            1.000                 16
    47             .100                  8
    48             .900                 27
    49             .930                 31
    50            1.000                 16
    51             .145                  1
    52            1.000                 16
    53             .668                 27
    54            1.000                 16
    55            1.000                 31
    56             .905                 31
    57             .550                 11
    58            1.000                 22.5
    59            1.000                 24.5
    60             .050                  1
    62            1.000                 16.5
    64             .100                 13.5
    65            1.000                 16
    66             .668                  6
    67             .953                  7.5
    68            1.000                 31
    69            1.000                 16
    70            1.000                 24
    71            1.000                 24.5
    72            1.000                 30
    73            1.000                 16
    74             .905                 30
    75             .668                 15
    76            1.000                 16

Σ Ranks = 584

Critical T-Values (α = 0.05) = 435 (lower), 589 (upper)
Critical T-Values (α = 0.10) = 447 (lower), 577 (upper)
normal distribution. For our example we have 32 observations, 30 replications, and 51
instances of tied values with varying numbers of ties; in this case E[T] = 512 and
Var[T] = 1521. We can calculate critical values by evaluating the equation T = 39z + 512,
where z is the α/2 (lower) or 1 - α/2 (upper) percentile of the standard normal distribution. As
shown at the bottom of Table 2, there is insufficient evidence to reject the null hypothesis at an
α-level of 0.05; however, at an α-level of 0.10, the null hypothesis would be rejected.
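
The arithmetic of this example can be retraced with a short Python sketch. The ranks are those listed in Table 2, and the variance of 1521 is taken from the text rather than recomputed, since the individual tie counts are not listed. The function names, and the choice to round the critical values outward (which happens to reproduce the integers printed under Table 2), are assumptions made for illustration.

```python
import math
from statistics import NormalDist

# Ranks of the empirical values, as listed in Table 2 (32 shots).
RANKS = [16, 11, 16, 16, 8, 27, 31, 16, 1, 16, 27, 16, 31, 31, 11, 22.5,
         24.5, 1, 16.5, 13.5, 16, 6, 7.5, 31, 16, 24, 24.5, 30, 16, 30, 15, 16]


def expected_t(n_obs, n_reps):
    """E[T] = N(M + 2)/2."""
    return n_obs * (n_reps + 2) / 2


def critical_values(mean_t, var_t, alpha):
    """Two-sided critical values from the normal approximation to T."""
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * math.sqrt(var_t)
    # Rounding outward is an assumption; the paper does not state its rule.
    return math.floor(mean_t - half_width), math.ceil(mean_t + half_width)


t_stat = sum(RANKS)
mean_t = expected_t(32, 30)
for alpha in (0.05, 0.10):
    lower, upper = critical_values(mean_t, 1521, alpha)
    rejected = not (lower <= t_stat <= upper)
    print(alpha, lower, upper, rejected)
```

The observed T = 584 lies inside the 0.05 interval but above the upper 0.10 value, in line with the conclusion stated above.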

Since the null hypothesis could not be rejected at an α-level of 0.05, we must be
concerned with the possibility of a Type II error; that is, accepting an invalid model. Figures
2-5 demonstrate the power of these tests against an alternative consisting of a shift in the
mean. Figure 3 shows that the power of this test is very good if F (the distribution of the
model output) and G (the distribution of the empirical data) are both uniform. However, as
seen in Figure 4, if F and G are both Cauchy, then the power of the test is rather poor.

Reynolds  and  Deaton13  have  proposed  other  test  statistics  more  powerful  against 
different  alternatives;  but  for  the  loss  of  function  data  where  empirical  results  that  are  close 
to  the  value  one  tend  to  be  assigned  that  value,  a  shift  in  the  mean  seems  to  be  an  appropriate 
alternative  hypothesis.  Since  the  power  against  this  particular  alternative  is  fairly  good 
overall,  our  confidence  in  the  hypothesis  tests  tends  to  increase.  However,  we  would  like  to 
be  able  to  make  these  tests  and  other  tests  still  more  powerful  and,  in  the  future,  will  be 
exploring  methods  to  accomplish  this. 


V.  SUMMARY 

When  referring  to  computer  simulation  models,  a  few  authors  continue  to  use  the  words 
verification  and  validation  interchangeably;  however,  most  distinguish  between  the  two  terms. 
Verification  of  a  computer  model  assures  that  the  simulation  is  behaving  as  the  modeler 
intends,  while  validation  assures  that  the  simulation  is  behaving  as  the  real  world  does. 
Verification  is  the  process  of  debugging  a  computer  program;  validation  is  making  it 
consistent  with  reality. 

Prior to 1967 very little was written concerning the validation of simulations; but much
has appeared since then, and there has been general agreement on several points - the most
important being that to validate a computer simulation model, empirical observations are
necessary and statistical tests are desirable. All validation techniques can be placed into one
of five categories: judgmental comparisons, hypothesis testing, spectral analysis, sensitivity
analysis, and indices of performance.

Nonparametric  ranking  techniques  are  one  class  of  statistical  hypothesis  tests.  We  have 
advocated  a  combination  of  independent  Mann- Whitney  tests  as  a  validation  procedure  for 
stochastic  simulation  models.  This  is  a  statistical  test  which  assesses  empirical  data  to  provide 
a  certain  level  of  confidence  in  the  computer  model.  The  main  disadvantage  is  the  same  as 
that of all hypothesis testing techniques; namely, their concern for protecting against Type I
errors, sometimes at the expense of Type II errors. A Type I error results in rejecting a valid
simulation model - unfortunate, but not as potentially dangerous as accepting an invalid
simulation model, which is known as a Type II error. For any particular test we can get an
indication of the probability of a Type II error by generating a series of curves that will allow
us to examine the power of the test against various alternatives.

Power  is  defined  as  the  probability  of  rejecting  a  false  null  hypothesis,  and  we  would  like 
this  value  to  be  as  close  to  one  as  possible.  For  our  advocated  test  we  have  evaluated  the 
power  for  some  specific  alternative  hypotheses  by  incorporating  a  Monte-Carlo  procedure 
into  a  computer  program,  which  allowed  us  to  perform  thousands  of  replications.  Each 
replication  represents  a  case  in  which  the  alternative  hypothesis  was  true,  and  we  determined 
whether or not the test rejected the null hypothesis. Obviously, we cannot compute power
against an alternative hypothesis as general as, "The simulation model is invalid." However,
in being more specific we are forced to examine an array of different alternative hypotheses;
and while a test may be powerful against a subset of these alternatives (such as a shift in the
mean of a distribution), it might be less so against others. The most we can hope for is
reasonable power against alternatives important to a particular investigation. The
combination of independent Mann-Whitney tests appears to have reasonable power against a
shift in the mean, but we would like to be able to increase it.

For  any  given  alternative  hypothesis  there  are  several  ways  of  increasing  the  power.  One 
such way can be seen in Figures 2-5 - increasing the number of observations. Another way is
to reduce the level of confidence in the test itself; that is, allow the probability of a Type I
error to increase. Because of the importance of this area of computer simulation validation,
we  hope  to  develop  other  ways  to  make  these  tests  more  powerful  against  a  wide  range  of 
alternatives  while  still  permitting  them  to  provide  acceptable  levels  of  confidence  in  their 
results. 
STATISTICAL DECISION CRITERIA APPROPRIATE FOR SMALL SAMPLES. POST-
B. H. BISSINGER


INTRODUCTION. The calculation of variability for our procurement problem
variable is of the utmost importance to the Navy supply system. After all,
it is pivotal in setting safety level. It appears some of our best savants
have taken a crack at this and the history seems to point out that one should
distinguish among the following:

Models

Mathematical Statistics

Approximations

A change in any one of these may, and apparently does, affect the variance
calculation.

This new approach avoids the problems others have run into.

In the appendices are fundamental formulae, a careful statistical analysis
to be heeded, and a history of those attempts to solve this problem.

THE MODEL. First, let us look at a simple, but typical, constant situation.
Suppose:

L = leadtime = 5 quarters

TAT = turn-around-time = 2 quarters

D = quarterly demand = 4 units
B = regenerations per quarter = 2 units
Then our net Z, procurement in a leadtime, is:

Z = (L)(D) - (L)(B) + (B)(TAT) or

  = D(TAT) + (D-B)(L-TAT)

  = 20 - 10 + 4 = 8 + 6 = 14
Here is a picture drawn by CDR L. Atkinson:


Let's look at a similar situation where L, TAT, and D are the same but B is
increased to 3. Then Z = 20 - 15 + 6 = 8 + 3 = 11.
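
The two deterministic forms of Z can be checked against each other with a small Python sketch. The function names are hypothetical; the inputs are the values given in the text (L = 5, TAT = 2, D = 4, and B = 2 or 3).

```python
def net_procurement(leadtime, tat, demand, regen):
    """Net procurement over a leadtime, Z = L*D - L*B + B*TAT."""
    return leadtime * demand - leadtime * regen + regen * tat


def net_procurement_alt(leadtime, tat, demand, regen):
    """Algebraically equivalent form, Z = D*TAT + (D - B)*(L - TAT)."""
    return demand * tat + (demand - regen) * (leadtime - tat)


for regen in (2, 3):
    z = net_procurement(5, 2, 4, regen)
    assert z == net_procurement_alt(5, 2, 4, regen)
    print(regen, z)
```

Both forms give Z = 14 for B = 2 and Z = 11 for B = 3.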


A similar deterministic portrayal was given by CDR T. Bunker as follows:

[Diagram not reproduced; recoverable labels: AMOUNT, L]

These mnemonic heuristic diagrams are fine if used properly to set up the
relevant indeterministic expressions.


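THE MACHINERY. Let $r_1$ = recovery rate and $r_2$ = repair rate so that
$r_1 r_2$ is the percentage (decimal equivalent) of replenishment, and hence,
$1 - r_1 r_2$ = attrition rate.

From the just discussed and pictured process (model) we can write the
procurement problem variable as:

$$Z = Z_1 - Z_2 + Z_3$$

where

$$Z_1 = \sum_{i=1}^{L} D_i, \qquad Z_2 = r_1 r_2 \sum_{i=1}^{L} D_i, \qquad Z_3 = r_1 r_2 \sum_{i=1}^{T} D_i .$$

The variance of Z is

$$\sigma_Z^2 = \sigma_{Z_1}^2 + \sigma_{Z_2}^2 + \sigma_{Z_3}^2 - 2\,\mathrm{COV}(Z_1, Z_2) + 2\,\mathrm{COV}(Z_1, Z_3) - 2\,\mathrm{COV}(Z_2, Z_3).$$

First let us compute the three variances:

$$E(Z_1 | L) = L \mu_D, \qquad V(E(Z_1 | L)) = \mu_D^2 \sigma_L^2$$

$$V(Z_1 | L) = L \sigma_D^2, \qquad E(V(Z_1 | L)) = \mu_L \sigma_D^2$$

$$\therefore\ \mathrm{VAR}\ Z_1 = \mu_L \sigma_D^2 + \mu_D^2 \sigma_L^2 .$$

Obviously, since $Z_2$ = a constant times $Z_1$,

$$\mathrm{VAR}\ Z_2 = r_1^2 r_2^2\,(\mu_L \sigma_D^2 + \mu_D^2 \sigma_L^2).$$

Also, since $Z_3$ is the same as $Z_2$ except for T replacing L, and has the same
constant multiplier as $Z_2$,

$$\mathrm{VAR}\ Z_3 = r_1^2 r_2^2\,(\mu_T \sigma_D^2 + \mu_D^2 \sigma_T^2).$$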
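
The random-sum variance used for each of the three terms, VAR(D_1 + ... + D_L) = μ_L σ_D² + μ_D² σ_L² for i.i.d. demands independent of L, can be checked by simulation. The sketch below is illustrative only: the equally likely leadtimes {4, 5, 6} and demands {2, 4, 6} are made-up inputs, and the function names are hypothetical.

```python
import random
from statistics import mean, pvariance


def random_sum_variance(leadtimes, demands):
    """mu_L * var_D + mu_D**2 * var_L for equally likely listed values."""
    mu_l, var_l = mean(leadtimes), pvariance(leadtimes)
    mu_d, var_d = mean(demands), pvariance(demands)
    return mu_l * var_d + mu_d ** 2 * var_l


def simulate_leadtime_demand(leadtimes, demands, trials=40000, seed=1):
    """Empirical variance of the demand summed over a random leadtime."""
    rng = random.Random(seed)
    totals = []
    for _trial in range(trials):
        n_quarters = rng.choice(leadtimes)
        totals.append(sum(rng.choice(demands) for _ in range(n_quarters)))
    return pvariance(totals)


print(random_sum_variance([4, 5, 6], [2, 4, 6]))
print(simulate_leadtime_demand([4, 5, 6], [2, 4, 6]))
```

For these inputs the formula gives 5(8/3) + 16(2/3) = 24, and the simulated variance should land close to that value.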
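Next the

$$2\,\mathrm{COV}(Z_1, Z_2) = 2\,\mathrm{COV}\Big(\sum_{i=1}^{L} D_i,\ r_1 r_2 \sum_{i=1}^{L} D_i\Big) = 2 r_1 r_2\, \mathrm{VAR}\Big[\sum_{i=1}^{L} D_i\Big] = 2 r_1 r_2\,(\mu_L \sigma_D^2 + \mu_D^2 \sigma_L^2).$$

The

$$2\,\mathrm{COV}(Z_1, Z_3) = 2\,\mathrm{COV}\Big(\sum_{i=1}^{L} D_i,\ r_1 r_2 \sum_{i=1}^{T} D_i\Big) = 2 r_1 r_2 \Big[E\Big(\sum_{i=1}^{L} D_i \sum_{i=1}^{T} D_i\Big) - \mu_L \mu_T \mu_D^2\Big].$$

Now

$$E\Big(\sum_{i=1}^{L} D_i \sum_{i=1}^{T} D_i\Big) = E\big[LT\,E(D^2)\big] = E\big[LT\,(\mu_D^2 + \sigma_D^2)\big].$$

Assuming L and T are independent, we get the above to be:

$$\mu_L \mu_T\,(\mu_D^2 + \sigma_D^2).$$

So the 2 COV $(Z_1, Z_3)$ becomes:

$$2 r_1 r_2\,\big[\mu_L \mu_T (\mu_D^2 + \sigma_D^2) - \mu_L \mu_T \mu_D^2\big] = 2 r_1 r_2\,(\mu_L \mu_T \sigma_D^2).$$

The third covariance then follows easily from the above and we have

$$2\,\mathrm{COV}(Z_2, Z_3) = 2 r_1^2 r_2^2\,(\mu_L \mu_T \sigma_D^2).$$

So, combining all six terms we get:

$$\mathrm{VAR}\ Z = (\mu_L \sigma_D^2 + \mu_D^2 \sigma_L^2) + r_1^2 r_2^2\,(\mu_L \sigma_D^2 + \mu_D^2 \sigma_L^2) + r_1^2 r_2^2\,(\mu_T \sigma_D^2 + \mu_D^2 \sigma_T^2)$$
$$\qquad - 2 r_1 r_2\,(\mu_L \sigma_D^2 + \mu_D^2 \sigma_L^2) + 2 r_1 r_2\,(\mu_L \mu_T \sigma_D^2) - 2 r_1^2 r_2^2\,(\mu_L \mu_T \sigma_D^2)$$

$$= (1 - r_1 r_2)^2\,(\mu_L \sigma_D^2 + \mu_D^2 \sigma_L^2) + r_1^2 r_2^2\,(\mu_T \sigma_D^2 + \mu_D^2 \sigma_T^2) + 2 r_1 r_2\,(\mu_L \mu_T \sigma_D^2) - 2 r_1^2 r_2^2\,(\mu_L \mu_T \sigma_D^2)$$

so that

$$\mathrm{VAR}\ Z = (1 - r_1 r_2)^2\,(\mu_L \sigma_D^2 + \mu_D^2 \sigma_L^2) + r_1^2 r_2^2\,(\mu_T \sigma_D^2 + \mu_D^2 \sigma_T^2) + 2 r_1 r_2\,(1 - r_1 r_2)\,(\mu_L \mu_T \sigma_D^2).$$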

This last formula is a model builder's dream. It has highly desirable
properties. First, note the coefficients add up to unity,

$$(1 - r_1 r_2)^2 + 2 r_1 r_2\,(1 - r_1 r_2) + r_1^2 r_2^2 = [(1 - r_1 r_2) + r_1 r_2]^2 = 1$$

So they may be considered weights attaching importance to the factors they
multiply. Next, numerical values for the various factors are easily available
and anyone can easily calculate the total expression.

Then it has sort of a group symmetry in that it is invariant under the
transformation sending L to T, T to L and $r_1 r_2$ to $1 - r_1 r_2$ and vice versa.
Molecular chemists and physicists go into ecstasy over such formulas as they
say it shows strength.

Each term has meaningful sense as you read it. There is a fraction of
the variance of leadtime demand, a fraction of the variance of
turn-around-time demand, and an interaction term to make up the rest.

Let's say $r_1 r_2$ = .9 which I am told is not unrealistic. We get back into
service 90% of what we bought after repairing. Then $1 - r_1 r_2$ = .1 and our
coefficients become:
.01 on variance of leadtime demand

.81 on variance of turn-around-time demand

.18 on the interaction of the above two

It makes sense to put most of your weight on that which is most active. The
interaction term can be written as

*  f'l'.'VD)  •  (l  -  Vi«V 0>l 


which is like an association index.
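
The three weights are simply the terms of the binomial expansion of [(1 - r) + r]², where r is the replenishment fraction, so they sum to one for any r. A small Python sketch (the function name `variance_weights` is hypothetical) reproduces the coefficients quoted for r = .9:

```python
def variance_weights(replenishment):
    """Weights on (leadtime demand, turn-around-time demand, interaction)."""
    attrition = 1 - replenishment
    return attrition ** 2, replenishment ** 2, 2 * replenishment * attrition


weights = variance_weights(0.9)
print([round(w, 2) for w in weights])
print(round(sum(weights), 12))
```

Setting the replenishment fraction to zero puts all of the weight on the leadtime demand variance, which corresponds to the consumable case.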

SUMMARY. Process should always come first, like in Management Science
policy should precede procedure. I owe much thanks to J. Boyarski who,
after suffering with the historical presentations as I went through them,
impressed me with the Markov closed loop process we have here and stressed
the systems engineering aspects. I finally gave up on fiddling with what
everybody else had done and started from scratch. It looks like it paid off.

Finally we see this is a true generalization of the consumable model in
that if $r_1 r_2$ = 0; i.e., no repairables, we find the correct expression for a
consumable.
APPENDIX A

Certain random variable expressions arise in the computations for the
variance of the procurement problem variable regardless of the model. Here
we give them and their variances.

1. In assuming the quarterly demands are i.i.d. we compute the variance
of the random variable sum of them as

T“  Ji  Di 

Otherwise, we would have more complications. For example, if we assumed that
successive demands had correlation ρ, then an additional term of the form

MjO^-1)* 

would appear, thereby increasing the variance. We know the variance of a
mean of correlated variables cannot be driven down by increasing sample size.
As it is, we are assuming L and D are independent.

2. For any two random variables x and y:

$$E(x \cdot y) = \rho\,\sigma_x \sigma_y + \mu_x \mu_y$$

If x and y are independent, this reduces to

$$E(x \cdot y) = \mu_x \mu_y$$

3. For any two random variables x and y:


I  0*  -  I  <*•*»* 

J  i  .  »  »  +  _a  a 

Vy  V»  +  Yy 

(1) 

*xf  '  2  *VV«y 

(2) 

coy  (x*,y*) 

(3> 
For jointly normal with zero means $\mathrm{COV}(x^2, y^2) = 2\,[E(xy)]^2$.
If x and y are independent this reduces to

$$\sigma_{x \cdot y}^2 = \mu_x^2 \sigma_y^2 + \mu_y^2 \sigma_x^2 + \sigma_x^2 \sigma_y^2$$


4. COV(kx, x) = k(VAR(x)) where k is a constant.


5. COV(x, a - x) = $-\sigma_x^2$.


6. In the UICP formulation we assume the number of units demanded each
time period (i) is a random variable $D_i$ which is described by a fixed, known
frequency distribution and which is not autocorrelated. Also, it's assumed
the return-from-repair each time period (i) is a random variable $R_i$ which is
described by a fixed, known frequency distribution and which depends on (is
correlated to) exactly one observation of demand, namely, the demand that
occurred a set turn-around-time (T) prior; i.e., $D_{i-T}$. We run into the
covariance of $D_{i-T}$ and $R_i$. To simplify it we further assume that:

$$R_i = P_i D_{i-T}$$

where $P_i$ is the return rate of the (i-T)-th period times the survival rate
of the i-th period. Then we can write:

$$\mathrm{COV}(D_{i-T}, R_i) = \mathrm{COV}(D_{i-T}, P_i D_{i-T}) = E(D_{i-T} P_i D_{i-T}) - E(D_{i-T})\,E(P_i D_{i-T})$$
$$= E(P_i)\,E(D_{i-T}^2) - E(P_i)\,[E(D_{i-T})]^2 = \mu_P \sigma_D^2$$

where we further assumed P and D are independent so that

E(R) = E(P) E(D)
7. NASA uses the following approximation for σ²:


"i'y  *  “A  *  2  'VA'j 

♦  *****  +  ****)(!  ♦  a*) 
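As a hedged illustration of notes 3 to 5, the sketch below checks three identities by exact enumeration: the independent-case product rule VAR(xy) = μ_x²σ_y² + μ_y²σ_x² + σ_x²σ_y², COV(kx, x) = k VAR(x), and COV(x, a - x) = -VAR(x). The two small discrete distributions, the constants k and a, and all function names are assumptions made only for this example; none of them come from the paper.

```python
from fractions import Fraction
from itertools import product

def mean(dist):
    """Exact mean of a {value: probability} distribution."""
    return sum(v * p for v, p in dist.items())

def var(dist):
    """Exact variance of a {value: probability} distribution."""
    m = mean(dist)
    return sum(p * (v - m) ** 2 for v, p in dist.items())

def joint_cov(pairs):
    """Exact covariance from a list of ((u, v), probability) entries."""
    eu = sum(p * u for (u, _), p in pairs)
    ev = sum(p * v for (_, v), p in pairs)
    return sum(p * (u - eu) * (v - ev) for (u, v), p in pairs)

# Hypothetical independent variables x and y (values invented for the check).
x = {1: Fraction(1, 2), 3: Fraction(1, 2)}
y = {0: Fraction(1, 4), 2: Fraction(1, 2), 4: Fraction(1, 4)}

# Distribution of the product xy under independence.
xy = {}
for (xv, px), (yv, py) in product(x.items(), y.items()):
    xy[xv * yv] = xy.get(xv * yv, 0) + px * py

var_xy = var(xy)
rule = mean(x) ** 2 * var(y) + mean(y) ** 2 * var(x) + var(x) * var(y)

# Notes 4 and 5, with hypothetical constants k and a.
k, a = 5, 10
cov_kx_x = joint_cov([((k * v, v), p) for v, p in x.items()])
cov_x_a_minus_x = joint_cov([((v, a - v), p) for v, p in x.items()])

print(var_xy, rule, cov_kx_x, cov_x_a_minus_x)
```

Exact fractions are used instead of simulation so that the identities hold to the last digit rather than only approximately.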
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dtrmua 

There is an essential point to be made regardless of the model used. We
will illustrate it by considering three different expressions which are
algebraically equivalent in deterministic algebra and also which have the same
first moments when we consider the symbols to be random variables and switch
to the algebra of indeterminism. However, the second moments are not
necessarily equal and, hence, neither are the variances calculated therefrom.

First consider the elementary algebra identity

X - X = 0 (1)

Now consider the related-in-form random variable expression

X_1 - X_2 (2)

where X_1 and X_2 are i.i.d. The mean of this random variable expression is 0,
and so it appears there is no need to distinguish between (1) and (2). But
the variance of (2) is 2σ² while the variance of a constant like 0 is 0.

Another simple example comes from taking X + X = 2X and then making the
variables random variables, which leads to the contradiction 2σ² = 4σ².
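The distinction can be made concrete with a small exact calculation. The sketch below is only an illustration under assumed inputs: the three-point distribution for X and the helper name `variance` are hypothetical. It compares using the same random variable twice (X - X, X + X) with using two i.i.d. copies (X_1 - X_2, X_1 + X_2).

```python
from fractions import Fraction
from itertools import product

# Hypothetical distribution for X: three equally likely values.
values = [1, 2, 6]
p = Fraction(1, len(values))

def variance(outcomes):
    """Exact variance of a list of (value, probability) pairs."""
    m = sum(v * q for v, q in outcomes)
    return sum(q * (v - m) ** 2 for v, q in outcomes)

sigma2 = variance([(v, p) for v in values])

# Two i.i.d. copies: enumerate every pair (X_1, X_2).
diff_iid = variance([(a - b, p * p) for a, b in product(values, repeat=2)])
sum_iid = variance([(a + b, p * p) for a, b in product(values, repeat=2)])

# The SAME variable used twice: X - X is the constant 0, X + X is 2X.
diff_same = variance([(v - v, p) for v in values])
sum_same = variance([(2 * v, p) for v in values])

print("sigma^2        =", sigma2)
print("VAR(X_1 - X_2) =", diff_iid, "  VAR(X - X) =", diff_same)
print("VAR(X_1 + X_2) =", sum_iid, "  VAR(2X)    =", sum_same)
```

The means agree in each pair of expressions, but the variances are 2σ² versus 0 in the first case and 2σ² versus 4σ² in the second.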

Why all this very elementary talk? Well, consider:


Then  the  variance  of  this  is 


I  D..  t  Ot 
1.-1  1  i-i  1 


(3) 


VAX 


(4) 
while if we use deterministic algebra first, viz.,


we get


(3) 


So let us now consider three different expressions that exist in different
presentations of our procurement problem variable. These three expressions
are algebraically equivalent in deterministic algebra. Here they are:


\[ \sum_{i=1}^{L} D_i - \sum_{i=1}^{T} D_i \qquad (a) \]

\[ \sum_{i=1}^{L-T} D_i \qquad (b) \]

\[ \sum_{i=T+1}^{L} D_i \qquad (c) \]


It is easily seen that if we suddenly make D_i, L and T random variables and
any two D_i, D_j are i.i.d. and L and T are independent with L > T, then the
mean of (a), (b) and (c) is

\[ (\bar{L} - \bar{T}) \bar{D} \]

But the variances differ! Let us develop the variance of (a).


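Let Y_a = (a). Then

\[ E(Y_a \mid L, T) = (L - T) \bar{D} \]

\[ \mathrm{VAR}(E(Y_a \mid L, T)) = \bar{D}^2 \sigma_{L-T}^2 \qquad (6) \]

Now

\[ \mathrm{VAR}(Y_a \mid L, T) = L \sigma_D^2 + T \sigma_D^2 \qquad (7) \]

assuming the D_i's are i.i.d. Then

\[ E(\mathrm{VAR}(Y_a \mid L, T)) = \bar{L} \sigma_D^2 + \bar{T} \sigma_D^2 \qquad (8) \]

The variance of Y_a is the sum of (6) and (8),

\[ \mathrm{VAR}(Y_a) = \bar{D}^2 \sigma_{L-T}^2 + (\bar{L} + \bar{T}) \sigma_D^2 \]

If we further assume L and T are independent, then

\[ \mathrm{VAR}(Y_a) = \bar{D}^2 (\sigma_L^2 + \sigma_T^2) + (\bar{L} + \bar{T}) \sigma_D^2 \qquad (9) \]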

How about (b)? Let

\[ Y_b = \sum_{i=1}^{L-T} D_i \]

VAI  t,  -  <£.?)  ♦  5*l+I 

-  (L-T)  +  5J+I  f  *  «■  (10) 

Finally the variance of (c), and we call Y_c = (c), is

VAI  T0  -  LrJ  +  D*#*  ♦  <f  ♦  l)#J  ♦  D (U> 

<T) 

The reader will notice several similarities and dissimilarities. Before
that, I call attention to the question mark under the plus sign in (11). Some
places I have found a minus sign here! The variances for (a) and (c) are
similar, the difference being minor and depending on integer versus
continuity for T. On the other hand, the variance of (b) not only has a factor
(L-T) on one term as opposed to (L+T) in (a) and (c), but it also has an
involved variance term which, heretofore, has been mysteriously handled.


The point is that (a) and its variance are the correct approach.


I  refer 
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6mmu£ 

Back as far as 1963 when the PARS were being written (PAR £ - Application
D, Operation 6 (Levels Computations for Repairables)) we find the formula for
the variance of attrition demand given to be

\[ \sigma_{D-rB}^2 = \sigma_D^2 + \sigma_{rB}^2 - 2\,\mathrm{COV}(rB, D) \qquad (1) \]

where

D = quarterly demand
r = average repair survival rate
B = carcass return rate

\sigma_{rB}^2 is broken down into the correct three terms, based on independence
of r and B, namely,

\[ \bar{r}^2 \sigma_B^2 + \bar{B}^2 \sigma_r^2 + \sigma_r^2 \sigma_B^2 \qquad (2) \]

(See APPENDIX A, formula (4)).

Further, assuming (a) that r is independent of B and D and (b) that the
RFI regenerations for a given quarter are a function of demand from a prior
quarter, the covariance term in expression (1) reduces to

\[ -2\,\bar{r}\,\mathrm{COV}(B, D) \]

and quickly is added

coy  <»,»>  -  ’ 

Also, under the assumptions:

(a) Demand during turn-around-time is independent of attrition during
leadtime less turn-around-time;

(b) Leadtime and turn-around-time are independent,

the covariance of demand during procurement turn-around-time with attrition
during leadtime less turn-around-time is given to be

\[ \mathrm{COV}(DT, (D - rB)(L - T)) = -\sigma_T^2 [\bar{D}(\bar{D} - \bar{r}\bar{B})] \]

Finally, the variance (V_Z) of the procurement problem variable is given to be

\  *  »Jt  ♦  •«..»)  (L-t>  <»-«*)  a-D  I 
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hnmu  a 

It was in the mid-60s when we were writing the PARS that Peter Zehna
turned his attention to accounting for attrition during turn-around-time.

He agreed with the others that we estimate from past history recovery rate
and repair rate, say r_1 and r_2, and then R = 1 - r_1 r_2 is the attrition rate.

Also, we all assumed that L > T and that demands are mutually independent.

Initially we said the procurement problem variable Z is to account for all
of the demands during a leadtime less the regenerations during that time. It
was computed by accounting for the demands during turn-around-time T and
adding the attritions during leadtime less turn-around-time. Zehna objected on
the grounds that this implicitly assumed that regenerations for a given
leadtime are a function of demands during the leadtime less turn-around-time.

He proposed what he said was more realistic and computationally simpler.

He suggested we assume that regenerations are a function of the demands that
occur during turn-around-time T. These regenerations are available for issue
during the leadtime L and occur at a rate r^. Hence, they can be expressed
as the random variable

I 

Vs  Di  (1) 


So the procurement problem variable can be written
L  TIL 

*  •  Z  •  *.*,  t  D.  * 1  £  ®i  ♦  t  \ 

l-l  1  *  *  l-l  1  1-1  1  l-T*l  1 

Using our usual formula for the variance of the random sum of random
demands, Zehna obtained:
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*’<Vd  *  +  +  *i> 


(3) 


-  *£  tM,.  •  (l-t1)^!  ♦  Mj  l<£  ♦  (14ft*  )ff*  ] 

In our usual notation for sample estimates this gives,


A 

o\  -  sj  [L  -  (1"1*)T]  +  D*  <Ut*)  a*] 


Let's hold up here a minute and go back and rewrite (2) as:


Z 


I  L 

i  I  M  I  ■> 

i-l  1  i-l 


i 


<*) 


(3) 


(«) 


Then


r  i 

■  1  r  J*  i  i 

‘  X  1 

VAt  Z  -  VAZ  It  I 

•  D,  1  +  VAt  I  I  0t|  +  VA1 

L  l-l  J  L  l-l  1 

r  T  Li 

-  i-l 

•f  2  00V  | 

1  I  Df  I 

L  i-l  1  i-l  J 

r  T  T  i 

-  2  COV 

*  I  J  DU 

L  i-l  i-l  J 

r  L  T  i 

•  2  COf 

I  Dt.  I  D 

L  i-l  1  i-l  J 

(7) 

(S) 


(9) 


(10) 


Let us compute first the three covariances.
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■  *  [*  ii K  A  DJ  •  *w4  m 

r  T  T  , 

ll  l  D  l  D. 

1  i-i  1  1-1 

I 

•  -  R  VA1  J  Dt 

"  *  ♦  fi'ri  (9a> 

[I*  T 

l  I  #. 

l-l  1  1-1  ‘-1 

'  1  f  Ji  °i  *  &  BJ  ‘  w#  do.) 

Now let's combine (8), (9), and (10) as (8) - (9) - (10).


I 


*wi 


*  a -Miyyi  •  *Vd  *  *vi 


-  0 


* *<  vi  ♦  4*i> 


ft  VAft 


T 

1 


1 


where we assumed T and L independent.
So altogether, by correct mathematical statistics, we obtain

»***-*  i  V*D  ♦  441  ♦  +  4.*  ♦  V'D  ♦  44 

•  »  <Vn  ♦  44> 

-  (**-awi)  IV*D  ♦  441  *  Vo  ♦  <4*1 

Separating this into two terms, one on \mu_D^2 and one on \sigma_D^2 as Zehna did,
yields

VAR  Z  -  [(R*-2R+l)<ar  +  ^ ]<£  +  +  (R1 -21+1)0*]^ 

-  tMi  *  <l-D%lo|  +  [*l  +  <l-R>*o*  ]*£ 

We note this is very similar to Zehna's result (4); the difference lies
in the coefficients

(1-R)^2 vs. 1-R^2
and

(1-R)^2 vs. 1+R^2

So we see that Zehna's coefficient on \mu_T \sigma_D^2 is negative and, hence, makes
a smaller coefficient. On the other hand, his coefficient on \mu_D^2 \sigma_T^2 is larger
by 2R.

In 1964, J. W. Prichard of BUSANDA Navy Headquarters (today NAVSUP)
presented a paper entitled "Inventory Model for Repairable Items - Theory and
Practices."
He let Z = DL - BL + BT = DT + (D-B)(L-T) be the random variable of the
amount of material demanded in a leadtime by users and by the repair process,
but not satisfied by the repair process. The variance of Z becomes:

\[ \mathrm{VAR}\,Z = \sigma_{DT}^2 + \sigma_{(D-B)(L-T)}^2 + 2\,\mathrm{COV}[DT, (D-B)(L-T)] \]

The last term, which is -2\sigma_T^2 [\bar{D}(\bar{D}-\bar{B})], is needed because of the
obvious correlation between gross demand during a turn-around-time and the
net demand to be met from purchase during that portion of the procurement
leadtime in excess of the turn-around-time.

The other two terms in the expression for VAR Z can be expanded into
the form for sums over a random interval of random demands, viz.

4t  -  *4  ♦ 

*<D-»)<L.t)  ■  l»i  +  »1  ♦  i  cat  (D,»)I 
+  <D-I>*  |.J  +  »*] 

The covariance term 2 COV(D, B) is approximately equal to


So we end up with

VAR  -  Taj  (»)*  a*  ♦  <L-f)  [a*  ♦  a*  -  2  -j"  aj  ] 


♦  <D-I>*  [aj  ♦  a*  ]  *  2  5<5-i)a* 
Note that B is used here for rB in the PARS example. This same approach and
results were used by J. Schnalkar.

Here is a development by CDR Keith Lippert without the covariance term:
VAR  (DxT  <D-rB')(L-T)l  -  V(DxT)  +  V(D*rB» )(L-T) 

-  5*«J  ♦  f»J  +  (D-rI>)*»J,.t  +  <E-T).J.tJ, 

-  5*.*  +  T»J  ♦  (O-t*')'  +  a-I)»J|.cS, 

a  9*,  aaaualng  lndapandanca . 

VrB'  D 


*rB' 


<rr  >*  t  <r)f  <B*  )d*d»* 

(r1!*1-  aJ#>f<»')f<r)drdB’ 


-J  r*l(B’)*f(r)4r  -  J  2rjy£,f(r)dr  +  |  j£j£,f(r)ds 
-  I  (r1)  I  <»'  *)  -  u'A.  ♦  fa 


-  <aj  +  4  *»»>  * 

-  4-  **r*,  (thla  followa  froa  our  APPENDIX  A  aquation  (3). 
So in total for Z = procurement problem variable
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V«)  -  o’.’  +  f.J  ♦  {D-r»')'(.’  ♦  .’)  +  <i-f)  (.* 
Compressing rB' into simply B, this becomes


[Figure: SIGMA vs. Dbar (Original Sigmas). Variables constant except Vi.]

[Figure: SIGMA vs. Dbar (Proposed Methods). Variables constant except Vd.]

[Figure: SIGMA vs. Dbar (Bissinger Method). Variables constant except Vd.]


LATIN  HYPERCUBE  SAMPLING: 

A  WAY  OF  SAVING  COMPUTER  RUNS 


W.  J.  Conover 

College  of  Business  Administration 
Texas  Tech  University 
Lubbock,  Texas  79424 


ABSTRACT. When real-life situations are modeled using a computer
program, the computer program is frequently very large and takes a long time
to make each run. In order to get the most information from a limited number
of computer runs, latin hypercube sampling was invented. The widespread
usage of latin hypercube sampling attests to its value in producing precise
estimates of the output distribution parameters. In addition, a useful
method for inducing correlations among the input variables in simulations
is discussed.


1. INTRODUCTION. The advent of high-speed computers has opened new
doors for solving difficult real-world problems. Computer codes are written
to simulate the behavior of the real-world situation, and then the codes are
run repeatedly on the computer to estimate the outcome under various
different circumstances, where those circumstances are used as inputs to the
computer code. Unfortunately, these computer codes often become very complex
in an attempt to make the codes as realistic as possible, and as a result
they take so long to run on the computer that the number of runs is limited
by time and money constraints. Also, computer codes become more complex when
the number of different input variables increases.


Thus  the  following  situation  often  arises.  A  complex  computer  code  is 
written  that  mimics  the  real  life  situation  as  well  as  one  can  expect  from 
any  computer  code.  It  contains  many,  perhaps  hundreds,  input  variables  or 
parameters  that  can  be  varied  to  represent  different  circumstances  that 
should  be  considered,  and  it  takes  so  long  to  run  on  the  computer  that  only 
a  few  simulation  runs  (say  20  to  100)  are  possible  due  to  time  and  money 
constraints. 


How  is  this  possible?  In  everyone's  mind  there's  the  feeling  that  the 
number  of  runs  must  be  larger  than  the  number  of  variables.  However,  that 
notion  comes  from  solving  systems  of  linear  equations,  and  does  not  apply 
to  computer  runs.  For  example,  one  could  simply  choose  a  likely  value  for 
each  of  the  k  input  variables,  and  make  a  single  computer  run  using  these 
values.  Then  one  could  use  a  different  set  of  values  for  the  input 
variables,  perhaps  representing  a  possible  undesirable  scenario,  and  make 
a second run on the computer. So k, the number of input variables, can be
much larger than n, the number of runs.

The  question  then  becomes,  how  should  the  various  values  of  the  input 
variables  be  selected  so  as  to  get  the  most  information,  in  some  sense,  out 
of  a  limited  number  of  runs?  One  approach  is  the  deterministic  approach, 
which  says  to  select  particular  sets  of  values  of  the  input  variables  that 
you,  or  someone  else,  want  to  examine  for  one  reason  or  another.  The  output 
of  the  computer  code  then  applies  to  the  scenarios  represented  by  those  sets 
of input values. There are obvious advantages to this approach, but the main
disadvantage is that few, if any, probability statements can be made, and
often any kind of post-hoc analysis is very limited.

A second approach is to use a Monte Carlo approach, and randomly select
values for each input variable, one value at a time, and do the same for all
input variables. This assumes that each input variable has a known
probability distribution so a random selection may be made. Then the output
is one random value of the output. By repeating the procedure several times,
several independent random observations are made on the output, and
estimates of the output probability distribution can be made. This method
is called random sampling. It allows for many different types of probability
statements on the output, or concerning the relative importance of the
various input variables.

A third approach, called latin hypercube sampling, is discussed in this
paper.  It  has  been  used  for  at  least  ten  years  by  several  national  research 
laboratories,  notably  Los  Alamos  National  Laboratories  and  Sandia 
Laboratories.  It  is  used  in  at  least  22  different  countries  for  selecting 
input  variables  in  long-running  computer  codes,  primarily  for  modeling 
nuclear  reactor  behavior,  and  the  behavior  of  deep  underground  nuclear  waste 
repositories.  Inquiries  regarding  a  computer  code  that  facilitates  its  usage 
should  be  addressed  to  Dr.  Ronald  L.  Iman,  Sandia  Laboratories,  Albuquerque, 
(505)844-8834,  who  has  gone  out  of  his  way  in  the  past  to  make  this  program 
available  to  prospective  users. 

The popularity of latin hypercube sampling is due to its characteristic
of having a relatively small variance, as compared with random sampling for
example, in the estimates of the output distribution. Thus the same types
of probability statements available from random sampling are also available
using latin hypercube sampling, but usually with much more precision.

2. LATIN HYPERCUBE SAMPLING. One characteristic of most computer-coded
models with many input variables is that some input variables are more
influential than others in affecting the outcome. We would concentrate our
attention on the more influential input variables, if only we knew which
ones they were. But that is often the purpose of the simulation, to find out
which input variables are the most influential on the outcome.

If we knew that the outcome was almost entirely dependent on one input
variable, say X1, then we would almost certainly want to select values of
X1 that span its entire range. In this way we could see how the outcome
varies over the entire range of values of X1, and we would have a complete
picture of the model's behavior. If we were allowed to make n runs on the
computer, we could divide the range of X1 into n intervals of equal length
and select one value from each interval for each run. Some of the intervals
may be very unlikely to be encountered in real life, however, and besides that,
what do we do if the range of X1 is infinite? So it makes more sense to
divide the range of X1 into n intervals of equal probability, rather than
of equal length, and randomly sample one value from each interval. Thus all
of the n values of X1 carry the same weight, and no problem arises if the
range of X1 is infinite.
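The equal-probability stratification of a single input can be sketched in a few lines. This is only an illustration, not the Sandia program: the function name `stratified_sample`, the choice of a standard normal distribution for X1, the seed, and n = 10 are all assumptions. Each of the n strata covers probability 1/n, one uniform draw is made inside each stratum, and the inverse CDF maps it back to a value of X1, so an infinite range causes no difficulty.

```python
import random
from statistics import NormalDist

def stratified_sample(inv_cdf, n, rng):
    """Draw one value from each of n equal-probability intervals."""
    values = []
    for j in range(n):
        # Uniform draw inside the j-th probability stratum [j/n, (j+1)/n).
        u = (j + rng.random()) / n
        # Guard the end points so the inverse CDF stays finite.
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        values.append(inv_cdf(u))
    return values

rng = random.Random(0)                 # fixed seed, for repeatability only
x1_dist = NormalDist(0.0, 1.0)         # hypothetical distribution for X1
x1 = stratified_sample(x1_dist.inv_cdf, 10, rng)

for j, v in enumerate(x1):
    print(f"stratum {j}: X1 = {v:+.3f}")
```

The tails are represented in every sample of n values, which plain random sampling does not guarantee.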
The problem is that we don't know, before running the code, which
variable is the most important. Furthermore, in many situations there is
more than one output from the model, and while X1 may be the most important
input variable for output Y1, say, another input variable X2 may be the most
influential input variable for another output Y2, say. Or if the output is
a function of time, one input variable may be the most influential one at
an early point in time, while another one may be the most influential one
at a later point in time. In fact this is the rule more than the exception.
How do we handle this situation?

One obvious solution is to treat both X1 and X2 with equal
consideration. Stratify over the entire range of X1 to obtain the n values
of X1 as described above, and in a similar manner stratify over the entire
range of X2 to obtain the n values of X2 for the n computer runs. Then how
do we decide which values of X1 to pair with the values of X2 in the various
computer runs? The approach used in this section is simply to pair them in
a random manner, as variables would be paired in real life if they were
independent of each other. In the next section a method of pairing is
discussed, to achieve a desired correlation between X1 and X2. But for now,
random pairing is used.

Of course it now becomes obvious what to do if a third input variable
X3 is also important. Stratify over the entire range of X3 to get the n
input values for X3, and do a random permutation of those n values to match
them with the (X1, X2) pairs already established. A similar treatment can
be made of all of the input variables. In that way if one of them turns out
to be very important, it has been treated with importance by stratifying
over its entire range. If it turns out that one of the input variables is
of little or no importance in influencing the output, nothing is lost using
this procedure since all of the influential input variables are stratified
over their entire range. Including this unimportant variable neither aids
nor inhibits the amount of information obtained from the other variables.
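Putting the pieces together, a latin hypercube sample stratifies every input over its own range and then pairs the columns by independent random permutations. The sketch below is a minimal version under stated assumptions: the function name `latin_hypercube`, the three input distributions, the seed, and n = 5 are hypothetical and chosen only to keep the output short; it is not the code distributed by Sandia.

```python
import math
import random
from statistics import NormalDist

def latin_hypercube(inv_cdfs, n, rng):
    """Return n runs; each input is stratified into n equal-probability intervals."""
    columns = []
    for inv_cdf in inv_cdfs:
        column = []
        for j in range(n):
            # One uniform draw inside the j-th probability stratum.
            u = (j + rng.random()) / n
            u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inverse CDFs finite
            column.append(inv_cdf(u))
        rng.shuffle(column)  # random pairing with the other inputs
        columns.append(column)
    # Row i holds the input values used for computer run i.
    return [tuple(col[i] for col in columns) for i in range(n)]

# Hypothetical input distributions, each given by its inverse CDF.
inv_cdfs = [
    NormalDist(0.0, 1.0).inv_cdf,   # X1: standard normal
    lambda u: 10.0 * u,             # X2: uniform on [0, 10)
    lambda u: -math.log(1.0 - u),   # X3: exponential with mean 1
]

design = latin_hypercube(inv_cdfs, n=5, rng=random.Random(42))
for run in design:
    print(tuple(round(v, 3) for v in run))
```

Adding another input only adds another shuffled column; the number of runs n does not have to grow with the number of inputs k.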

Intuitively this seems like an efficient method for getting the most
information out of a limited number of computer runs, but how good is it
really? In an attempt to answer this question several different sampling
plans were compared using real computer codes, by McKay, Conover and Beckman
(1979), Iman, Conover and Campbell (1980) and Iman and Conover (1980). In
all cases the output parameters were estimated with much more precision
using latin hypercube sampling than with any of the other procedures
examined, and the improvement was dramatic. This does not imply that there
are not better methods for selecting input variables, or that this same
dramatic improvement will be evident for all types of computer codes. It was
true for the codes we examined, when compared with random sampling and a
different form of stratified sampling.

One disadvantage of latin hypercube sampling is that even though the
estimates are very precise, no measure of the precision is available, as it
is when using random sampling. The solution to this problem lies in
replicating a latin hypercube sample several times. For example, if a total
of 100 runs is allowed on the computer, first use 10 runs, or 20 runs if you
prefer, for a latin hypercube sample, where each variable is stratified over
10 (or 20) intervals. Then repeat the procedure for another 10 runs, again
stratifying over 10 intervals for each variable, but of course the
individual values are unlikely to be the same as before, and the random
matching of one variable with another is unlikely to be the same as before.
By repeating this procedure until the total number of runs is exhausted,
several independent estimates of the output are obtained, where each
estimate has the precision one can expect from latin hypercube sampling, and
the group of estimates together provide an estimate of that precision. This
variation of latin hypercube sampling is explored by Iman and Conover
(1980), and as one would expect some precision is lost by this combination
of latin hypercube sampling and random sampling, but the benefit is in
obtaining a measure of the precision in the form of a standard deviation of
the estimate. The new level of precision is somewhere between pure latin
hypercube sampling and pure random sampling.
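The replication idea can be sketched as follows. Everything specific here is an assumption for illustration: the function `model` is a hypothetical stand-in for a long-running code, the three inputs are taken as uniform on [0, 1), and the 100 runs are split into 10 replicates of 10 runs. Each replicate gives one estimate of the mean output, and the spread of the 10 estimates gives a standard error.

```python
import random
from statistics import mean, stdev

def lhs_uniform(k, n, rng):
    """n runs of k inputs on [0, 1): one value per stratum, randomly paired."""
    columns = []
    for _ in range(k):
        column = [(j + rng.random()) / n for j in range(n)]
        rng.shuffle(column)
        columns.append(column)
    return list(zip(*columns))

def model(x):
    """Hypothetical stand-in for an expensive computer code."""
    return x[0] + 2.0 * x[1] ** 2 + 0.1 * x[2]

rng = random.Random(1)
replicate_means = []
for _ in range(10):                      # 10 replicates x 10 runs = 100 runs
    design = lhs_uniform(3, 10, rng)     # a fresh latin hypercube sample
    replicate_means.append(mean(model(x) for x in design))

estimate = mean(replicate_means)
std_error = stdev(replicate_means) / len(replicate_means) ** 0.5
print(f"estimated mean output = {estimate:.4f} +/- {std_error:.4f}")
```

For this toy model the exact mean is 0.5 + 2/3 + 0.05, so the printed estimate can be compared against a known answer; with a real code only the standard error would be available.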

3. CORRELATING THE INPUT VARIABLES. Thus far it has been tacitly
assumed that the input variables are mutually independent, and therefore the
population correlation matrix is the identity matrix I.
The sample correlation matrix, the matrix of sample correlation coefficients
representing the actual correlation of the selected input values for the
various input variables, will be close to I, with differences due solely to
sampling variability.

Often the input variables in a computer code represent variables which
in real life are correlated. If the input variables in the computer code had
a sample correlation close to the real correlation between those variables,
the result would be a more realistic simulation, with more believable
results. How can we match the input variables so that the matching is no
longer random, but rather contrived to achieve a target correlation? The
method described in this section shows how to achieve a target rank
correlation, which may be the closest we can come to achieving a target
correlation due to the possibility of long-tailed input distributions where
outlying observations dominate the regular correlation coefficient, but have
minimal effect on the rank correlation coefficient. Recall, the rank
correlation coefficient, called Spearman's correlation coefficient, is just
the regular correlation coefficient computed on the ranks of the
observations. See Conover (1980) for a complete description of rank
correlation.
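A small sketch may help fix the definition: Spearman's coefficient is the ordinary correlation coefficient applied to ranks, so a single outlying observation changes it very little. The data values and the helper names `pearson`, `ranks`, and `spearman` below are invented for illustration, and the rank function assumes there are no ties.

```python
def pearson(x, y):
    """Ordinary (product-moment) correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ranks(values):
    """Ranks 1..n of the values (this sketch assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out

def spearman(x, y):
    """Rank correlation: the regular coefficient computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: the last x value is an outlier from a long tail.
x_tail = [1.0, 2.0, 3.0, 4.0, 5.0, 1000.0]
x_mild = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0, 6.0]

print("regular:", pearson(x_mild, y), "->", pearson(x_tail, y))
print("rank:   ", spearman(x_mild, y), "->", spearman(x_tail, y))
```

The regular coefficient moves when the outlier is introduced, while the rank coefficient is unchanged because the ranks of x are the same in both cases.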

An example can help describe the concept. Suppose n = 15 runs are
authorized on a model with k = 6 input variables. Three of the input
variables are mutually independent, and the other three are highly
correlated. The population correlation matrix C looks like this.
Each input variable has 15 values, obtained by using the stratification
procedure described for latin hypercube samples. If the 15 values for each
input variable are permuted randomly the sample correlation matrix might
look like this.
The matrix T shows how random correlations may differ from the target
value of zero, and sometimes the difference is fairly large. In this case
the target correlations are given in the matrix C. How can one obtain
correlations, albeit rank correlations, close to the ones in C?


If the values of the input variables are permuted so that their
rankings agree with the following rankings, then their rank correlation
coefficients will be given by the rank correlation matrix M, given below.
Note how close the rank correlations are to the target correlations
given above in the matrix C. Even the correlations aiming at the value zero
come much closer to zero than the random correlations in the matrix T. Thus
even if the input variables are independent, one may prefer to use this
procedure to obtain nearly orthogonal (in the sense of ranks) input vectors,
rather than relying on random matching which may produce, by chance,
correlations quite far from the target values of zero, as shown in the
matrix T.

It is necessary for the number of runs n to be larger than the number
of variables k* for which correlations are being designated, in order to use
this procedure. Note that k* may be less than the total number of variables
k.

One advantage of using the rank correlation coefficients becomes
apparent. The ranks, when paired as they are above, always result in the
rank correlation matrix M, no matter what the original numbers are, and
therefore no matter what the marginal distributions might be. Thus this
method of inducing rank correlations is free of any distributional
assumptions regarding the input variables.

Although we are using this method of inducing correlations in
conjunction with latin hypercube samples, it is in no way tied to latin
hypercube sampling. It works equally well with random sampling, or any other
way of obtaining values for the input variables. All that is required is a
rearrangement of the input values so that their ranks agree with a
prescribed set of ranks, in order to obtain a rank correlation matrix close
to the target rank correlation matrix.

Of  course  the  big  question  is,  how  does  one  obtain  the  prescribed  set 
of  rankings  for  any  given  rank  correlation  matrix,  as  given  above  for  the 
matrix  M?  As  you  would  expect,  the  method  is  not  simple.  It  can  be  done  by 
hand, but the Sandia computer program is recommended for convenience, since
it  takes  the  difficulty  out  of  the  procedure.  For  those  who  are  not  afraid 
of  matrix  manipulation,  the  procedure  is  as  follows. 

1. Start with any set of n numbers, called scores, where n is the
number of runs. We usually use normal scores, which are the i/(n+1)
quantiles from a standard normal distribution, i = 1, ..., n, which are
readily available from any table of the standard normal distribution
such as that in Conover (1980). Denote these scores by a(1), ..., a(n).

2. Form a matrix R with k* columns in it, where each column contains
a random permutation of the n scores, and where k* represents the
number of input variables being correlated. Be sure all permutations
are distinct.

3. Find the sample correlation matrix T of R. Note that T is the
regular correlation matrix, not the rank correlation matrix. However
it is a characteristic of normal scores, and normal random variables,
that regular correlation coefficients and rank correlation coefficients
are usually quite similar.




4. Find a matrix Q such that QQ' = T, where Q' denotes the transpose
of Q. Mathematicians have devised several methods for finding Q. The
one that we use is the Cholesky factorization scheme, which results in
a lower triangular matrix for Q.

5. Let the target correlation matrix be denoted by C. Find a matrix P
such that PP' = C. Again, we use the Cholesky factorization scheme
because of its relative simplicity.

6. Find S = PQ^(-1) and compute R* = RS'. The ranks of the matrix R* (one
column at a time) are the ranks we are seeking. Any set of input
vectors with the same ranks as R* will have a rank correlation matrix
close in value to the target correlation matrix C.

Why  does  this  work?  First,  the  regular  sample  correlation  matrix  of  R* 
is  C.  This  is  a  simple  result  that  can  be  shown  with  a  little  matrix 
algebra.  Second,  because  we  started  with  normal  scores,  the  rank  correlation 
coefficients  of  R*  are  usually  numerically  close  to  the  regular  correlation 
coefficients,  given  in  C.  Therefore  any  matrix  with  the  same  ranks  as  R* 
will  have  the  same  rank  correlations  as  R*,  which  should  be  close  to  C. 
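
The six steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Sandia program: the function names `target_ranks` and `rearrange`, and the use of NumPy and SciPy, are assumptions made for the example. It does not check that the random permutations are distinct, and it requires n > k* so that T is positive definite.

```python
import numpy as np
from scipy.stats import norm, rankdata


def target_ranks(n, C, rng):
    """Return (ranks, R_star) for n runs and target correlation matrix C."""
    C = np.asarray(C, dtype=float)
    k = C.shape[0]
    # Step 1: normal scores, the i/(n+1) quantiles of the standard normal.
    scores = norm.ppf(np.arange(1, n + 1) / (n + 1))
    # Step 2: each column of R is a random permutation of the scores.
    R = np.column_stack([rng.permutation(scores) for _ in range(k)])
    # Step 3: ordinary (not rank) sample correlation matrix of R.
    T = np.corrcoef(R, rowvar=False)
    # Steps 4 and 5: lower triangular Cholesky factors, QQ' = T and PP' = C.
    Q = np.linalg.cholesky(T)
    P = np.linalg.cholesky(C)
    # Step 6: S = P Q^(-1), R* = R S'.
    S = P @ np.linalg.inv(Q)
    R_star = R @ S.T
    ranks = np.column_stack(
        [rankdata(R_star[:, j]) for j in range(k)]
    ).astype(int)
    return ranks, R_star


def rearrange(X, ranks):
    """Permute each column of X so that its ranks agree with `ranks`."""
    X_sorted = np.sort(np.asarray(X, dtype=float), axis=0)
    return np.take_along_axis(X_sorted, ranks - 1, axis=0)
```

Because every column of R holds the same scores, the sample correlation matrix of R* is S T S' = P Q^(-1) (QQ') (Q^(-1))' P' = PP' = C, which is the "little matrix algebra" referred to above. The rearranged inputs returned by `rearrange` keep their original marginal values and only change their pairing.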
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ABSTRACT 

Graphical  methods  for  designing  experiments  have  been  used  since  the  inception  of 
statistical experiment design, yet this approach has received little recognition in the
literature.  This  presentation  surveys  historical  uses  of  graphical  displays  and  shows  how 
graphical  representations  can  clarify  the  difference  between  a  bad  design  and  a  good  one. 
Some  practical  rules  for  generating  new  designs  by  graphical  means  are  presented. 


KEYWORDS: Experiment Design, Graphical Methods
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I.  INTRODUCTION 


How  can  graphical  tools  be  used  in  the  process  of  designing  an  experiment?  First, 
consider  the  steps  involved  in  experiment  design.  One  can  think  of  this  process  as 
composed  of  five  steps.  These  must  occur  before  any  data  are  collected,  and  before 
statistical  analyses  are  performed.  They  are: 

1. define the purpose of the experiment,

2. identify the independent, intermediate, dependent, and nuisance variables,

3. classify the variables as quantitative or qualitative, linear or nonlinear effect
(independent variables), and fixed or varied during the experiment (independent
variables),

4. using the above information, choose or create a design, and

5. validate the design.

This paper presents graphical methods for steps 2, 4, and 5 of this process. For step 2,
we will show Andrews and fishbone diagrams. Multidimensional point plots and a
variety of other techniques can be used in step 4. For step 5, we will discuss graphical
properties of good designs, and the importance of checking projections.

Because  of  the  high  graphical  content  of  this  presentation,  the  format  of  the  following 
paper  is  unconventional.  Its  form  is  more  like  that  of  an  oral  presentation,  with  figures 
placed  on  the  left  side  of  each  page,  and  the  accompanying  text  on  the  right  (opposite 
each  figure)1.  This  allows  approximately  sixty  figures  to  be  discussed  in  thirty  pages, 
which  might  otherwise  have  taken  twice  the  space. 


1 The following pages come from a session entitled "Practical Graphical Techniques for the Design and
Analysis of Experiments" presented by James Filliben, Gerald Hahn, and this author at the 1987 American
Statistical Association Winter Conference in Orlando, Florida. These figures are more complete than the
Army Design of Experiments presentation in most ways, although some recent material was presented in
Monterey that is missing here.




VIEWGRAPH 


TEXT 


•  OVERVIEW  OF  A  THREE  PART  TALK 

•  ABOUT 20  MINUTES  PER  SECTION 

•  BEGIN  WITH  GRAPHICAL  METHODS 
FOR  DESIGN-NOT  JUST  FOR  VIEWING 
DESIGNS,  BUT  FOR  DESIGNING 
DESIGNS, 

• WHY GRAPHICAL METHODS?

> provide better understanding of design
> make it easy to generate a new design
> provide a layout to run the design from

•  START  FROM  A  BROAD  CONTEXT! 

WHAT ARE THE EVENTS LEADING
TO  THE  NEED  FOR  AN  EXPERIMENT? 


•  Why  is  the  experiment  necessary? 

•  What  is  known  about  the  system  that  is  being 
investigated? 

• What are the KEY VARIABLES:

Independent 

Dependent 

Intermediate 

• Anticipated complexity of relationships?

•  Known  constraints  on: 

variable/factor values
experimental  procedure 

•  What  is  the  expected  outcome? 

•  Why  use  GRAPHICAL  methods? 

right-brain,  creative 
powerful,  robust 
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What do we mean by GRAPHICAL designs?

Andrews  used  representations  that  were 
graphic  indeed! 

They convey more than just the combinations
of factor levels that will be tried, triggering
the viewer's imagination to think about the
often important details as well as the [illegible]
structure (cf. viewgraphs 53 & 54).

At the first level of experiment design, one
needs to view the process that will be
investigated. This viewgraph shows the
representation Andrews used to plan
experiments for a meat processing operation.


Source:  Andrews  (1964). 


Ishikawa's  "fishbone"  diagrams:  quicker  to 
draw,  help  to  identify  appropriate 
experiments  to  try. 

Several  forms: 
cause-effect
process-oriented 
clustered  lists 

A  process-oriented  diagram  for  the  axle 
manufacturing  problem  would  be  organized 
to  have  the  major  process  steps  on  the 
backbone, with subprocesses hanging off
these, etc. Causes of wobble would tend to
be the outermost 'bones' on the 'skeleton'.


Source:  Ishikawa  (1982). 
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Outline 

What graphical tools exist to aid in designing experiments?
What graphical concepts do these tools exploit?

What are the strengths and limitations of graphical methods?
What role can computers play in graphical DOX?

Summary - the place of graphical methods in DOX




The following pages show graphical methods
to address specific kinds of designs, e.g.
factorial, lifetest, etc.

Greatest  concentration  on  multidimensional 
point  plots  for  factorial  and  fractional 
designs. Reason: the ratio
(practical value) / (current use).

Basic  outline  of  the  DOX  portion  of  this  talk 
is  at  left 


13.1 A LIST OF CONSTRUCTION METHODS

The following methods of constructing factorial designs [...]
literature:

(i) Orthogonal arrays.
(ii) Balanced arrays.
(iii) Latin squares and orthogonal Latin squares.
(iv) Hadamard matrices.
(v) Finite geometries.
(vi) Confounding.
(vii) Group theory.
(viii) Algebraic decomposition.
(ix) Combinatorial topology.
(x) Foldover.
(xi) Collapsing of levels.
(xii) Composition (direct product and direct sum).
(xiii) Codes.
(xiv) Block designs.
(xv) F-squares.
(xvi) Weighing designs.
(xvii) Lattice designs.
(xviii) Finite graphs.
(xix) One-at-a-time.


Graphical  methods  for  DOX  not  recognised 
as  an  entity  historically.  Computerized 
literature  search  gave  ZERO  titles,  keywords 
in  past  10  years  with  both  GRAPHICAL  and 
DOX. 


Source:  Raktoe,  et.  al.  (1981). 
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"Definition 13.1: A t x n matrix A with entries
from a set S of s symbols is called an
orthogonal array of size n, t constraints, s
levels, strength d, and index λ if any d x n
submatrix of A contains all s^d possible d x 1
column vectors based on s symbols of S
with the same frequency λ."


THE  MAIN  POINT:  it  is  easier  to 
understand,  manipulate  and  create  experiment 
designs  when  they  are  represented 
graphically.  Mathematical  descriptions  can 
be  precise,  sometimes  clear,  rarely  easy  to 
manipulate. 


Source:  Raktoe,  et.  al.  (1981). 


-Raktoe, Hedayat, and Federer
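
To see what the quoted definition asks for in practice, the check can be written out directly. The sketch below is illustrative only: the function name `orthogonal_array_index` is hypothetical, and the array is stored as t rows (constraints) by n columns (runs), following the definition.

```python
from collections import Counter
from itertools import combinations, product


def orthogonal_array_index(A, symbols, d):
    """Return the index (lambda) if A is an orthogonal array of strength d,
    otherwise None. A is a list of t rows, each holding n symbols."""
    t, n = len(A), len(A[0])
    s = len(symbols)
    if n % s**d != 0:
        return None
    lam = n // s**d
    # Every choice of d rows must show each of the s^d columns lam times.
    for rows in combinations(range(t), d):
        counts = Counter(tuple(A[r][c] for r in rows) for c in range(n))
        if any(counts[v] != lam for v in product(symbols, repeat=d)):
            return None
    return lam


# Half fraction of a 2^3 factorial: three factors (rows), four runs (columns).
half_fraction = [
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
]
```

The half fraction passes the check at strength 2 with index 1, which is the algebraic counterpart of the picture of four points on alternate corners of a cube.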


First volume, first paper in Technometrics,
primary  journal  for  examples  of  graphical 
DOX. 

Several  important  concepts  that  will  occur 
again  in  later  viewgraphs: 

1)  designs  decompose  into  subsets 

2)  vertices  of  regular  polyhedra  make  good 
point  subsets 

3)  use  of  point  symbols  to  add  information 
to  the  plot 


Source: DeBaun (1959).
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Youden's approach to representing an
incomplete design, circa 1962:

A  TABLE 


Source:  Youden  (1962). 




GRAPHICAL  DESIGN  OF  EXPERIMENTS  --  R.  BARTON 




Youden’s  choice  for  representing  the  same 
design,  circa  1972: 

A  PLOT 


Plot gives visual hints to confounding pattern
that can be used not just to display designs,
but to create them as well.

Source:  Youden  (1972). 


Box  and  Hunter  used  graphical  models  of 
designs,  and  studied  their  projections  to  find 
ones  with  "balance". 


Source:  Box  and  Hunter  (1961). 
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VIEWGRAPH 
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Here the actual factor levels are used to label
a 2^(4-1) design.


Source:  Andrews  (1964). 


A graphical representation developed for
understanding the design can later be
augmented to display the results of the experiment.

Bubbles of this 2^(5-1) design show outcomes of
experiments.

Extension: use a symbol that conveys both
location AND spread at each design point when
design includes replication (or is an
inner-outer design a la Taguchi).


Source:  Snee  (1985a). 


So far, we have shown designs displayed graphically to reveal
properties. That is, plots used DESCRIPTIVELY. How to use
graphical methods to GENERATE DESIGNS for particular
applications? READ ON -->


(most graphical references use plots for analysis or
presentation, not for design generation)
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VIEWGRAPH 


TEXT 


"It is proved (Appendix 1) that if a polynomial of any
degree d1 is fitted by the method of least squares over any
region of interest R in the k variables, when the true
function is a polynomial of any degree d2 > d1, then the bias
averaged over R is minimized, for all values of the
coefficients of the neglected terms, by making the moments
of order d1 + d2 and less of the design points equal to the
corresponding moments of a uniform distribution over R."


HOW  TO  GENERATE  DESIGNS 
GRAPHICALLY: 

PRINCIPLE  #1 

< - 

(i.e.  spread  points  out  uniformly  over  space) 


Source: Box and Draper (1959).


- G.E.P. Box and N.R. Draper


"... convenient to regard designs as built up from a number
of component sets of points, each set having its points
equidistant from the origin ..."

"... form the vertices of a regular polygon, polyhedron, or
polytope ..."

- Box and Hunter (1957)


HOW  TO  GENERATE  DESIGNS 
GRAPHICALLY 

PRINCIPLE  #2 

< - 


(if whole design too complex, use
divide-and-conquer strategy to design smaller
components to be combined -- see viewgraphs
30 and 31)


Source:  Box  and  Hunter  (1957). 
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"Choose new points to MAXIMIZE the minimum
distance from all existing design points..."


HOW  TO  GENERATE  DESIGNS 
GRAPHICALLY 

PRINCIPLE  #3 


(this consideration arises from "optimal"
design considerations -- min variance for first
order model terms)


-Kennard  and  Stone  (1969) 


Source: Kennard and Stone (1969).
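
Principle #3 lends itself to a short sequential routine. The sketch below is an assumption-laden illustration of the maximin idea quoted above, not a transcription of the Kennard and Stone algorithm: the name `maximin_sequence`, the finite candidate list, and the Euclidean distance are all choices made for the example.

```python
import numpy as np


def maximin_sequence(candidates, start, m):
    """Grow a design to m points by repeatedly adding the candidate whose
    minimum distance to the points already chosen is largest."""
    X = np.asarray(candidates, dtype=float)
    chosen = list(start)
    # dmin[i] = distance from candidate i to its nearest chosen point.
    diffs = X[:, None, :] - X[chosen][None, :, :]
    dmin = np.linalg.norm(diffs, axis=2).min(axis=1)
    while len(chosen) < m:
        nxt = int(np.argmax(dmin))
        chosen.append(nxt)
        dmin = np.minimum(dmin, np.linalg.norm(X - X[nxt], axis=1))
    return chosen


# Candidate points: a 3 x 3 grid on the square [-1, 1] x [-1, 1].
grid = [(a, b) for a in (-1, 0, 1) for b in (-1, 0, 1)]
design = maximin_sequence(grid, start=[4], m=5)  # index 4 is the center
```

Starting from the center of the square, the four points added are the four corners, which agrees with the visual rule of pushing new points as far as possible from the existing ones.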


SOME  USEFUL  CONCEPTS 
for  generating 
GOOD  DESIGNS 
from 

MULTIDIMENSIONAL  POINT  PLOTS 


Last  point,  used  extensively  by  Box  and 
Hunter,  was  mentioned  earlier. 


1 COVER THE DESIGN SPACE UNIFORMLY

2 DECOMPOSE COMPLICATED DESIGNS INTO
GRAPHICAL SUBCOMPONENTS

3 [...] EXISTING POINTS TO MINIMIZE VARIANCE
FOR FIRST ORDER EFFECTS

4 CHECK PROJECTIONS TO PLANES AND LINES
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Box  and  Draper  findings  above  give  some 
model  independence. 

BUT using graphical methods to generate
designs  does  not  free  us  from  the  fact: 

DESIGN  GOODNESS  DEPENDS  ON  THE 
TRUE  FORM  OF  THE  MODEL  BEING 
INVESTIGATED. 


Source:  Satterthwaite  (1959). 
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[Figure: blocked designs from Box, Hunter & Hunter (pp. 339-341)]

To  illustrate  multidim.  point  plots  for  design, 
first  show  a  3-factor  experiment  to  be  run  in 
4  blocks  of  2. 

Decomposition,  projection,  and  spanning 
(points  2,  3,  &4)  used  to  generate  good  design 
here.  (Decomposition  is  of  cube  points  into  4 
sets of antipodal pairs).

Block  effects  confounded  with  main  effects  in 
bad  design  seen  from  top  and  rear  projections. 

The  relative  merits  of  these  two  designs  much 
easier  to  see  here  than  in  their  original  (non- 
graphical)  description. 


Source:  Box,  Hunter,  and  Hunter  (1978). 
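
The antipodal-pair decomposition can also be checked numerically. In the sketch below, which is illustrative only, a main effect is taken to be clear of block effects when its +1/-1 column sums to zero inside every block. The "bad" blocking shown (blocks defined by the levels of the first two factors) is a hypothetical example and is not necessarily the bad design drawn in the viewgraph.

```python
from itertools import product

import numpy as np

runs = np.array(list(product([-1, 1], repeat=3)))  # the 8 cube vertices


def antipodal_blocks(runs):
    """Put each run in a block with its mirror image through the origin."""
    blocks = {}
    for r in runs:
        key = max(tuple(r), tuple(-r))  # one label per antipodal pair
        blocks.setdefault(key, []).append(r)
    return [np.array(b) for b in blocks.values()]


def main_effects_clear(blocks):
    """True if every factor column sums to zero within every block."""
    return all((b.sum(axis=0) == 0).all() for b in blocks)


good = antipodal_blocks(runs)
# Hypothetical bad design: each block holds one combination of factors 1 and 2.
bad = [runs[(runs[:, 0] == a) & (runs[:, 1] == b)]
       for a in (-1, 1) for b in (-1, 1)]
```

With antipodal pairs, each block contains both levels of every factor, so block differences cannot masquerade as main effects. In the hypothetical bad design the first two factors are constant within a block, which is the confounding that the top and rear projections reveal.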




This figure shows a multidim. point plot for
a 2^5 factorial design. The blocks for the
design above (blocked designs from BH&H) are
set in a row rather than a square because of
additional structure here not present above.

Example: block effect (1+4)-(2+3) would
appear as an "interaction" pattern here, while
an equivalent pattern, (1+2)-(3+4), would
have a main effect pattern.


[Figure: two block layouts, one labeled INTERACTION PATTERN and one labeled MAIN EFFECT PATTERN]

These plots and those to follow are easy to generate and
manipulate using a Macintosh (MacDraw). Projections
are NOT automatic, though.


Figure  at  left  can  be  used  as  a  template  for 
designs  with  7  or  more  factors. 

Fill  in  subset  of  dots  at  small  cube  vertices  to 
generate  an  incomplete  or  fractional  T  design. 

Use two dot symbols, e.g. • ■
for a 2^7 design.
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TEXT 


Can do multidimensional point plots for 2^m 3^n
designs, too.

Compare with Youden plot earlier, and
viewgraphs 31-34.


2-way  interaction  patterns 


Second example from literature is a 2^(7-2)
fractional factorial. Next three viewgraphs
illustrate the three designs presented in the
reference.

Value  of  "minimum  aberration"  designs  is 
consonant  with  graphical  design  principles. 

For 2^k designs, use decomposition and idea
that  best  fractions  span  the  space:  best  point 
allocation,  therefore,  is  based  on  three  way 
interaction  pattern. 


Reference:  Fries  and  Hunter  (1980). 
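
The "minimum aberration" criterion can be made concrete by listing the words of the defining relation and counting them by length. The sketch below is an illustration under stated assumptions: the function name `defining_words` and the two sets of generators are chosen for the example and are not claimed to be the exact designs drawn in the viewgraphs.

```python
from collections import Counter
from itertools import combinations


def defining_words(generators):
    """All words of the defining relation generated by the given words,
    plus a count of how many words there are of each length."""
    words = []
    for r in range(1, len(generators) + 1):
        for combo in combinations(generators, r):
            letters = set()
            for g in combo:
                letters ^= set(g)  # multiply words: squared letters cancel
            words.append("".join(sorted(letters)))
    return sorted(words, key=lambda w: (len(w), w)), Counter(map(len, words))


# Two illustrative 2^(7-2) fractions in factors A..G.
words_1, lengths_1 = defining_words(["ABCF", "BCDG"])
words_2, lengths_2 = defining_words(["ABCDF", "ABCEG"])
```

Both fractions have resolution IV, since the shortest word has length four, but the first has three four-letter words and the second only one (DEFG). Fewer short words means fewer confounded low-order interactions, which is the sense in which the second fraction has less aberration.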
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Identical  small-cube  forms  denoted  by 
circles. 

Good  large-cube  pattern. 

Poor  small-cube  pattern  -  can  be  fixed. 

All  projections  can  be  visualized  without 
much  trouble. 

Reference:  Fries  and  Hunter  (1980). 


THE  MINIMUM  ABERRATION  DESIGN 


Pattern  here  is  good;  still  some  flaws  -  the 
choice of the particular small-cube pattern
has  a  2-way  pattern  on  the  large  cube,  and  two 
way  pattern  separates  levels  of  f  based  on 
levels  of  d  (I=defg). 

At  this  point,  can  only  push  confounding 
around;  not  enough  design  points  to  fix. 


Reference:  Fries  and  Hunter  (1980). 
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[Figure: FULL DESIGN and FRACTIONAL DESIGN]

Minimum  aberration  and  incomplete  block 
examples  were  from  academic  literature. 

This example from RCA, industrial research
problem. Design was generated graphically,
as shown here, for an experiment in 1982.

Full  factorial  was  a  2  3  . 

Designed  a  1/2  fraction. 


Source:  Barton  (1982). 


The  1/2  fraction  was  composed  of  three  pieces, 
following  DESIGN  PRINCIPLE  #2.  Easy  to 
see  (and  to  design)  this  way. 

Note:  numbers  represent  run  order,  which  was 
modified  in  final  design. 


Source:  Barton  (1982). 
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Box-Behnken 3^k Fractional Design


BOX-BEHNKEN  DESIGN 

Illustrates use of icons for complicated
multidimensional point plots:


Reference:  Box  and  Behnken  (1960). 
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Easy to generate alternative fractions using the
icons; Bad-Barton design at left.

Some  properties  of  both  designs  immediately 
obvious: 

no  center 

no  extreme  vertices  (violates  #3) 

Other  properties  (like  why  Bad-Barton  is  bad) 
not  obvious  without  projections. 


Bad-Barton 3^k Fractional Design
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Multidimensional  point  plots  for  factorial 
designs  allow  intuitive  modifications  to 
incorporate  constraints  on  the  design  space. 

Snee (1981) gives rules used by CONSIM to
place  mixture  design  points  on  boundaries 
caused  by  constraints.  First  example  in  this 
presentation  of  "mental  graphics". 


Source:  Snee  (1985b). 


For many practical problems, constraints are
few enough to allow visualization,
and better control of the design.


Source:  Kinzer  (1985). 
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FIGURE 6. Ortho-Xylene Oxidation Kinetic Study: Experimental
Region Is Defined by Two Nonparallel Planes
in Three Dimensions (Juusola, Bacon, and Downie (1972)).


Another  example  showing  constraints 
limiting  the  experimental  region. 


Note:  complex  constraints  may  suggest  a 
transformation  to  the  model  factors. 


Source:  Snee  (1985b). 


FACTORIAL 

MULTIDIMENSIONAL 

PLOTS 

SUMMARY 


Multidim.  point  plots  are  useful  concepts  even 
when they can't actually be drawn. Fry [] uses
"mental  graphics"  to  construct  fractional  2  3 
designs  from  hypersphere  designs  composed 
of  multiple  sets  of  2  designs. 

Why  factorial  (hypercube)? 
answer:  limits  #  of  factor  levels,  easier  to  do 
math,  plot  results,  and  view  design  in 
2-D,  3-D,  etc. 


NEXT  SECTION  REVIEWS  SPECIAL 
METHODS  FOR  RESP.  SURFACE  /  EVOP 
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In  an  early  EVOP  worksheet,  multidim.  point 
plots  for  design  were  part  of  the  data 
collection  worksheet. 

Graphical design provides layout to run the
experiment  from. 


Source:  Box  and  Hunter  (1959). 


A  simplex  plan  that  is  updated  as  runs  are 
completed  can  be  used  to  choose  the  next  run 
point. 

This  is  graphical  sequential  design. 

Easier  if  superimpose  contours  of  model  fitting 
a  recent  subset  of  observations;  see  next 
viewgraph. 


Source: Hahn, Bemesderfer, and Olsson (1986)
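
The way a simplex plan picks the next run can be sketched with the basic reflection rule: drop the vertex with the worst response and reflect it through the centroid of the remaining vertices. This is a minimal illustration that assumes the response is to be maximized; the name `next_run_point` is hypothetical and no claim is made that this is the exact rule used in the cited reference.

```python
import numpy as np


def next_run_point(vertices, responses):
    """Return (index of worst vertex, reflected point to run next)."""
    V = np.asarray(vertices, dtype=float)
    worst = int(np.argmin(responses))  # lowest response when maximizing
    centroid = np.delete(V, worst, axis=0).mean(axis=0)
    return worst, 2.0 * centroid - V[worst]


# Three runs in two factors and their observed responses.
worst, new_point = next_run_point([(0, 0), (1, 0), (0, 1)], [1.0, 3.0, 2.0])
```

Here the run at (0, 0) is the worst, so the next run is placed at (1, 1), the mirror image of (0, 0) across the edge joining the two better runs. On a plot this is simply flipping the triangle over its best edge.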
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TEXT 


Here  the  model  is  not  a  polynomial  in  the 
usual  (Taylor  approximation)  sense,  but 
Hardy's  []  interpolation  function. 


Source:  Barton  (1985). 


Reference:  Hardy  (1971). 
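
For readers unfamiliar with this kind of model, the sketch below fits an interpolating surface through observed runs using a multiquadric basis, sqrt(distance^2 + r^2). That the interpolation function meant here is Hardy's multiquadric, and the choice r = 1, are assumptions made for the illustration; the name `fit_multiquadric` is hypothetical.

```python
import numpy as np


def fit_multiquadric(X, y, r=1.0):
    """Fit f(z) = sum_j c_j * sqrt(|z - x_j|^2 + r^2) through the data."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    coef = np.linalg.solve(np.sqrt(D**2 + r**2), y)

    def predict(Z):
        Z = np.atleast_2d(np.asarray(Z, dtype=float))
        Dz = np.linalg.norm(Z[:, None, :] - X[None, :, :], axis=2)
        return np.sqrt(Dz**2 + r**2) @ coef

    return predict


# Five runs in two factors (the run points must be distinct).
X_runs = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]
y_runs = [1.0, 2.0, 2.5, 3.0, 4.0]
surface = fit_multiquadric(X_runs, y_runs)
```

Unlike a low-order polynomial, the fitted surface passes through every observed response, so contours drawn from it follow the most recent runs closely.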


Above  trajectory  was  for  a  Nelder-Mead 
simplex  sequential  optimization  strategy. 


Here  are  simplices  of  a  different  sort  for  DOX: 
simplices  arising  from  mixture  experiments. 


The  next  few  viewgraphs  review  graphical 
representations  that  have  been  used  to  create 
and  analyze  mixture  designs. 


Source:  Cornell  (1981). 
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As  for  factorial  designs,  point  plots  can  be 
used  to  identify  subregions  for  study. 


In  addition  to  the  usual  mixture  constraint, 
most  real  mixture  problems  have  additional 
requirements  that  limit  the  design  space. 


Source:  Koons  and  Wilt  (1985). 
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More  complicated  constraints  yield  irregularly 
shaped  regions. 

Snee's XVERT program depends on the
geometric  concepts  of  edges,  vertices,  and  face 
centroids  to  select  "good"  design  points. 

Again,  this  is  "mental  graphics",  since  a 
graphical  image  is  used,  but  it  is  not  actually 
drawn. 


Source: Snee (1981).
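
The vertices of a constrained mixture region can be enumerated by brute force when there are only a few components. The sketch below is not XVERT: it is a simple enumeration in which all components but one are set to a lower or upper bound and the remaining one is obtained from the mixture constraint. The name `extreme_vertices` and the bounds in the example are assumptions for illustration.

```python
from itertools import product


def extreme_vertices(lower, upper, tol=1e-9):
    """Vertices of {x : lower <= x <= upper, sum(x) = 1}."""
    q = len(lower)
    found = set()
    for free in range(q):
        others = [i for i in range(q) if i != free]
        # Set every other component to one of its bounds ...
        for combo in product(*[(lower[i], upper[i]) for i in others]):
            # ... and let the mixture constraint fix the free component.
            x_free = 1.0 - sum(combo)
            if lower[free] - tol <= x_free <= upper[free] + tol:
                point = [0.0] * q
                for i, v in zip(others, combo):
                    point[i] = v
                point[free] = x_free
                found.add(tuple(round(v, 9) for v in point))
    return sorted(found)


whole_simplex = extreme_vertices([0, 0, 0], [1, 1, 1])
hexagon = extreme_vertices([0.1, 0.1, 0.1], [0.5, 0.5, 0.5])
```

With no extra constraints the three pure blends are recovered. With each component restricted to between 0.1 and 0.5, the region becomes a hexagon inside the triangle, which is the kind of irregular shape the viewgraph shows.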
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Four  factor  mixture  experiments  and 
constrained  subsets  can  be  drawn  effectively, 
and  have  been  used  in  industry. 


Source:  Hare  (1985). 




This  ends  material  on  multidimensional  point 
plots  for  DOX. 

Nomograms  and  graph  paper  graphs  are 
practical  tools  for  DOX,  but  they  are  not  in  the 
spirit  of  earlier  material.  Only  a  brief  sample 
here  to  illustrate  the  kind  of  advantages  they 
offer. 
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Graphical  technique  here  is  one  step  removed 
from  design.  It  represents  a  mathematical 
function  of  the  design  structure. 


Source:  Box  and  Lucas  (1959). 


A graphical aid for D-optimal design
(Box and Lucas, 1959)
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Like  a  nomogram,  this  graph  is  used  to  show 
the  variance  of  maximum  likelihood  estimates 
as  a  function  of  design  parameters. 

The  model  here  is  Arrhenius:  design 
parameters are test temperatures and test
time.  Censored  observations  are  expected. 


Source:  Nelson  and  Kielpinski  (1975). 


Because  design  properties  are  displayed 
graphically,  it  is  possible  to  optimize  other 
design  properties  (i.e.  other  than  variance  of 
estimates)  by  making  graphical  additions! 

Example:  minimize  the  maximum  test 
temperature  without  exceeding  a  variance  limit 


Source:  Barton  and  Nelson  (1987). 
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NETWORK 

DESIGN 

REPRESENTATIONS 


Like  the  nomograms  and  graph-paper 
graphs,  the  network  design  representations  to 
follow  are  one  level  removed  from  the 
design. 

Because of this, expect that they will be less
useful  for  design  synthesis. 




These  plots,  due  to  Butz,  relate  connectivity  to 
estimable  contrasts. 

For  small  examples,  these  plots  can  be  used  to 
set  up  and  evaluate  designs  for  ANOVA 
models. 


Source:  Butz  (1982). 
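
The link between connectivity and estimability can be illustrated for the simplest case, an additive treatments-by-blocks layout: all treatment contrasts are estimable when the graph joining each treatment to the blocks it appears in is connected. The sketch below checks that condition; the function name `is_connected_design` is hypothetical, a non-empty design is assumed, and the graph used here is a standard construction that may differ in detail from the plots in the cited source.

```python
def is_connected_design(cells):
    """cells: (treatment, block) pairs that are actually run (non-empty)."""
    adjacency = {}
    for t, b in cells:
        adjacency.setdefault(("T", t), set()).add(("B", b))
        adjacency.setdefault(("B", b), set()).add(("T", t))
    # Depth-first search from an arbitrary node.
    start = next(iter(adjacency))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for neighbor in adjacency[node] - seen:
            seen.add(neighbor)
            stack.append(neighbor)
    return len(seen) == len(adjacency)


chain = [(1, "a"), (2, "a"), (2, "b"), (3, "b")]
split = [(1, "a"), (2, "a"), (3, "b"), (4, "b")]
```

In `chain`, treatment 2 appears in both blocks and ties the design together. In `split`, treatments 1 and 2 never share a block with treatments 3 and 4, so a contrast between the two groups cannot be separated from the block difference.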
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Taguchi  uses  "linear  graphs"  to  expose 
confounding  patterns  in  fractional  designs. 
They  appear  useful  for  choosing  a  defining 
relation  that  yields  a  desired  confounding 
pattern. 

Method  of  construction:  unknown 


Source:  Taguchi  (1980). 
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Cuthbert  Daniel's  method  for  displaying 
confounding  patterns  is  more  difficult  to  see 
(for  me).  Used  to  analyze  rather  than  generate. 


Source:  Daniel  (1962). 
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Graphical  representations  of  hierarchy  help  to 
develop  nested  designs  for  mixed  and 
random  effects  models. 

Andrews  was  particularly  graphic. 


Source:  Andrews  (1964). 


A  simpler,  perhaps  less  informative 
representation  of  the  same  design.  This  form 
has  been  used  by  several  authors. 
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See also: Leone, Nelson, and Johnson (1968)
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What  graphical  concepts  do  these 

tools  exploit? 


1  Design  Balance/Symmetry 

2  Design  Projections 

3  "Face"  Incidence  of  Design  Points 

4  Network  properties:  connectedness, 
etc. 

5  Analog  Computations 



What  are  the  strengths  and  limitations 
of  graphical  methods? 

+  Flexible 

•  make  tradeoffs  visually 

•  incorporate  constraints  graphically 

+  Robust 

+  Uses  powerful  computer  -  human  eye 
+  Graphical  DOX  methods  easy  to  use  & 
remember 

-  Non-quantitative 

-  Dimensional  limitations 

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Of  course,  computers  play  other  roles  in  DOX, 
e.g.  DETMAX.  Here  we  mean  getting 
computers  to  help  with  the  plotting, 
projections,  views,  etc. 


What role can computers play in graphical DOX?


1 Make descriptive tools into prescriptive ones

• rapid plotting of alternative designs
• exhaustive plots of alternative designs for scanning

2  Interactive  graphics 

•real  time  design  manipulation 

•computed  design  properties  updated  and  displayed 

3  Rule-based  systems  to  manipulate  geometric  or  network  objects 




Even for DETMAX applications, graphical
methods are resorted to for understanding and
evaluation.


Source:  Mitchell  (1974). 
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Examples  of  graphics  shown  here  aren't  meant 
to  be  prescriptive;  graphical  DOX  as  a  distinct 
entity  is  too  new. 

This  selection  represents  useful  methods  to 
trigger  your  own  imagination. 

Try  to  find  useful  ways  to  handle  designs  with 
many  factors. 

USE  YOUR  RIGHT  BRAIN 
(and  may  the  force  be  with  you!) 


Reference:  Box  (1984). 
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VIEWGRAPH 


Summary - the place of graphical methods in DOX

1 Graphical: investigative, creative

2 Mathematical, Computer-Aided: confirmatory


NEXT: 


GRAPHICAL 

ANALYSIS 


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